DOCUMENT RESUME. 

ED 387 869 EA 027 057 



AUTHOR        Burstein, Leigh; And Others
TITLE         Validating National Curriculum Indicators. Draft.
INSTITUTION   Rand Corp., Santa Monica, Calif.
SPONS AGENCY  National Science Foundation, Arlington, VA.
REPORT NO     DRU-1086-NSF
PUB DATE      Jun 95
NOTE          115p.
PUB TYPE      Reports - Research/Technical (143) --
              Tests/Evaluation Instruments (160)



EDRS PRICE MF01/PC05 Plus Postage. 

DESCRIPTORS Data Collection; Data Interpretation; Elementary
Secondary Education; *Evaluation Criteria; Evaluation
Problems; *Instructional Materials; *Mathematics
Curriculum; Mathematics Instruction; Mathematics
Materials; *National Curriculum; *National Surveys;
Research Methodology; *Validity

IDENTIFIERS Educational Indicators 



ABSTRACT 

This report summarizes results from research aimed at 
improving the quality of information collected about school 
curriculum. The research sought to design and pilot a model for 
collecting benchmark data on school coursework. These more in-depth
data, such as course textbooks, assignments, exams, and teacher logs, 
can serve as anchors against which the validity of the survey items 
used in national data collections might be assessed. The data provide 
a basis for assessing the extent to which survey items measure what 
is taught in schools and classrooms. They can also be used to monitor 
whether the validity of teachers' responses has been undermined by
outside factors. Data were derived from a survey of 70 mathematics 
teachers in 9 secondary schools located in California and Washington. 
The survey was administered before and after the collection of 
artifact data. Data were also collected from teacher daily logs, 
assignments, and interviews with principals, counselors, and 
mathematics department chairs. Chapter 2 details the study design, 
and the next three chapters summarize the extent to which major 
dimensions of curriculum can be measured through national surveys and 
then validated through deeper probes in a smaller number of sites. 
The final chapter discusses the implications for the design of future 
curriculum-indicator systems and for the policy uses of such
information. It concludes that while an enhanced version of current 
national surveys can provide a reasonably accurate picture of high 
school mathematics teaching across the country, there are significant 
limitations on such data, and at this point, policy uses for more 
than informational purposes would be inappropriate. The study 
represents a first step in ensuring that curriculum indicators are 
valid and reliable measures of instruction. Nine figures, 12 tables, 
copies of the surveys, and a sample daily log form are included. 
(Contains 39 references.) (LMI) 






RAND 



Validating National 
Curriculum Indicators 



U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement

EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as
received from the person or organization
originating it.

Minor changes have been made to improve
reproduction quality.

Points of view or opinions stated in this
document do not necessarily represent official
OERI position or policy.



Leigh Burstein, Lorraine M. McDonnell, 
Jeannette Van Winkle, Tor Ormseth, Jim 
Mirocha, and Gretchen Guitton 

DRU-1086-NSF 
June 1995 



Prepared for National Science Foundation 






The RAND restricted draft series is intended to 
transmit preliminary RAND research. Restricted 
drafts have not been formally reviewed or edited 
and have not been cleared for public release. 
Views or conclusions expressed in the drafts are 
tentative. A draft should not be cited or quoted 
without permission of the author. 



PERMISSION TO REPRODUCE THIS 
MATERIAL HAS BEEN GRANTED BY 



TO THE EDUCATIONAL RESOURCES 
INFORMATION CENTER (ERIC)



RAND is a nonprofit institution that seeks to improve public policy through research and analysis.
RAND's publications do not necessarily reflect the opinions or policies of its research sponsors.

NOT CLEARED FOR OPEN PUBLICATION 






Chapter 1 
INTRODUCTION 

Efforts over the past decade to improve schooling have focused on changing the 
way schools are governed and on altering what they teach. The latter reform has been
the more difficult to implement because it is multi-faceted (affecting curriculum content,
teacher training, instructional approaches, and student assessment) and because
classroom practice has traditionally been that aspect of schooling most insulated from the 
reach of public policy. 

The multiple dimensions of curriculum and its varied manifestations in individual
schools and classrooms have also meant that information about what students are being
taught across the country is limited. This shortcoming has become evident as the
demand for more comprehensive indicators describing the status of U.S. schooling has
grown. A variety of policies, such as the articulation of academic standards and new
forms of student assessment, assume that information is widely available on the content
and modes of instruction. Yet most indicators of curriculum are limited to data
collected by states on course offerings and enrollment, enumerated only by conventional
course titles, and to national survey data based on student and teacher self-reports about
course-taking, topic coverage, and instructional strategies. Both forms of data are
inadequate. Many course titles convey no information about content or how that content
is presented. Although the national data from sources such as the National Assessment
of Educational Progress (NAEP) and the National Educational Longitudinal Study
(NELS) are richer, no attempt has been made to determine whether information
provided by teacher respondents is consistent with actual classroom practices and
activities. Nor are there any explicit design features built into national indicator efforts
to monitor whether responses are being corrupted by external events.

This report summarizes results from research aimed at improving the quality of
information collected about school curriculum. Its purpose was to design and pilot a
model for collecting benchmark data on school coursework. These more in-depth data,
such as course textbooks, assignments, exams, and teacher logs, can serve as anchors
against which the validity of the survey items used in national data collections such as
NAEP and NELS might be assessed. Together, these data constitute a series of deeper
probes than are possible with survey data. As such, they provide a basis for assessing the
extent to which survey items tap what is taught in schools and classrooms. They can also
be used to monitor whether the validity of teachers' responses has been undermined by
outside factors so that, for example, reports of classroom activities are consistent with
current reform rhetoric, but are not matched by changes in actual practice. Benchmark
data are more difficult and costly to collect, but they do not need to be collected as often
or on as large a sample as conventional indicator data.

Although this research focused specifically on high school mathematics, many of
its findings about measuring the multiple dimensions of curriculum also apply to other
academic subjects taught in secondary schools. The study design is detailed in chapter
two, and the three subsequent chapters summarize the extent to which major dimensions
of curriculum can be measured through national surveys and then validated through
deeper probes in a smaller number of sites. A final chapter discusses the implications of
our study for the design of future curriculum indicator systems and for the policy uses of
such information. We conclude that while an enhanced version of current national
surveys can provide a reasonably accurate picture of high school mathematics teaching
across the country, there are significant limitations on such data and, at this point, policy
uses for more than informational purposes would be inappropriate.

Before turning to a description of our research methods, we provide some
background for the study by discussing the research base on which it draws and the
relevant policy and practice context.

RESEARCH BASE 

A growing body of research documents the relationship between student
achievement, the types of courses taken, and the content and level of those courses.
Some of the most compelling evidence about the relationship between achievement and
curricular content comes from the Second International Mathematics Study (SIMS),
conducted between 1976 and 1982.[1] As their predecessors had recognized some twenty
years earlier in the First International Mathematics Survey, the SIMS researchers
understood that, in comparing student achievement across different national systems,
curricular differences had to be taken into consideration. That recognition led to the
notion of opportunity-to-learn (OTL). OTL became a measure of "whether or
not...students have had an opportunity to study a particular topic or learn how to solve a
particular type of problem presented by the test" (Husen as cited in Burstein, 1993:
xxxiii). SIMS researchers conceptualized the mathematics curriculum as functioning at
three levels: the intended curriculum as articulated by officials at the system or national
level; the implemented curriculum as interpreted by teachers in individual classrooms; and
the attained curriculum as evidenced by student achievement on standardized tests and by
their attitudes (Travers, 1993: 4).

[1] The SIMS data collection included the administration of achievement tests and
questionnaires to over 125,000 [...] in the grade level where the majority would be aged
13, and Population B, consisting of [...] terminal grade of secondary education who were
also studying mathematics. In addition, questionnaire data were collected from the
approximately 6,000 teachers of the tested students [...]

The major vehicle for measuring the implemented curriculum was an OTL
questionnaire administered to the teachers of tested students. Teachers were asked
whether the content needed to respond to items on the achievement tests had been
taught to their students. They were also asked more general questions about their
instructional goals, their attitudes and beliefs about mathematics teaching, their
instructional strategies, and their professional background.

One of the most important purposes that SIMS served was to document
differences in curriculum, and hence in opportunities to learn, across national systems.
For example, SIMS researchers found striking differences between the ways curricula are
organized in the countries where students scored highest on the SIMS tests and the way
they are organized in the United States. At the lower-secondary level, the Japanese
curriculum emphasizes algebra; the curricula in France and Belgium are dominated by
geometry and fractions. In contrast, U.S. schools allocate their curricula more equally
across a variety of topics, thus covering each subject much more superficially. The
mathematics curriculum in U.S. schools is characterized by extensive repetition and
review, and little intensity of coverage. This low-intensity coverage means that individual
topics are treated in only a few class periods, and concepts and topics are quite
fragmented (McKnight, et al., 1987).

The SIMS data also illustrated variations in opportunities to learn within the same
national system. For example, in the algebra content area, Japan's OTL ratings were
quite similar across classrooms and teachers, ranging from 60 to 100 percent coverage of
the content included in the SIMS test items, with the median topic coverage 85 percent.
In contrast, the United States' OTL ratings ranged from 0 to 100 percent, with the
median at 75 percent of the SIMS topics. About 10 percent of the thirteen-year-olds
tested in the U.S. were receiving virtually no instruction in algebra topics and 25 percent
were receiving instruction in only about half the mathematics content covered on the
algebra sub-test (Schmidt, et al., 1993).[2] Part of the reason for the greater variation in
OTL in the United States is that schools typically assign students to different kinds of
mathematics classes according to their abilities. Using the SIMS data, Kifer (1993)
found that when the U.S. eighth grade mathematics classes in the SIMS sample were
categorized as remedial, regular, enriched, and algebra, significant differences were
evident in students' opportunities to learn the content tested in SIMS. With the
exception of arithmetic topics, students in remedial classes receive very little teaching on
mathematics content, and even students in regular classes receive less content coverage
in algebra and geometry topics than those in enriched and algebra classes.
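
As a rough illustration of how an OTL rating of the kind reported above might be computed, the Python sketch below treats each classroom's rating as the percentage of test items whose content the teacher reports having taught, and then summarizes the range and median across classrooms. The function name, classroom labels, and responses are hypothetical and are not taken from the SIMS data; the actual SIMS scoring procedure may well differ.

```python
from statistics import median


def otl_coverage(taught_flags):
    """Percent of test items whose content the teacher reports having taught."""
    if not taught_flags:
        raise ValueError("need at least one test item")
    return 100.0 * sum(1 for taught in taught_flags if taught) / len(taught_flags)


# Hypothetical teacher responses: one list per classroom, one flag per test item.
classrooms = {
    "class_a": [True, True, True, True],
    "class_b": [True, True, True, False],
    "class_c": [True, False, False, False],
    "class_d": [False, False, False, False],
}

# One OTL rating per classroom, then the spread across classrooms.
ratings = {name: otl_coverage(flags) for name, flags in classrooms.items()}
lowest = min(ratings.values())
highest = max(ratings.values())
middle = median(ratings.values())

print(ratings)
print(f"range: {lowest:.0f} to {highest:.0f} percent; median: {middle:.0f} percent")
```

A wide range around a given median, as in this made-up example, is the pattern the text describes for the United States; a narrow range would correspond to the more uniform coverage reported for Japan.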

The SIMS results visibly influenced public discussion because they showed
significant gaps in U.S. students' achievement, as compared with students in other
industrialized countries. But other studies focused solely on the United States produced
similar findings about the effects of curricular exposure. For example, Raizen and Jones
(1985) summarized four studies based on nationally representative student samples that
showed a strong correlation between the number of mathematics courses students take
and their achievement in mathematics. These relationships persist even when
background variables such as home and community environment and previous
mathematics learning are taken into account. Research had also shown that the level, as
well as the number, of courses students take is correlated with achievement. Jones et al.
(1986), after controlling for socioeconomic status and test scores two years earlier, found
that students in the High School and Beyond (HS&B) sample with at least five transcript
credits in mathematics at or above the algebra I level scored an average of 17
percentage points higher on a standardized mathematics test than those with no course
credits in higher-level mathematics. In documenting that curricular exposure is a
significant predictor of student achievement and a critical factor in influencing the
distribution of students' learning opportunities, all these studies make a strong case for
supplementing data on student achievement with information about the curriculum they
experience.

[2] On the whole, opportunities to learn were considerably more uniform in France and
Japan than in the United States and New Zealand, although within-system variation is
greater for all systems in geometry than for either algebra or arithmetic topics
(Schmidt, et al., 1993).

POLICY AND PRACTICE CONTEXT 

Growing concern about the achievement of U.S. students and the distribution of
that achievement across different types of students has also prompted an intensified
focus on school curriculum as a focal point for policy interventions. Beginning in the
mid-1980s, elected officials, especially at the state level, extended their traditional
concern about how schools are governed and financed to include what schools teach,
who teaches it, and in some cases, how it is taught. In fashioning policies in this area,
policymakers drew on the research that demonstrated the close link between students'
curricular exposure and their achievement, and on expert advice about what constitutes
an engaging, productive curriculum.

Recent examples of this focus are the federal Goals 2000 legislation and similar
standards-setting exercises in the states. Goals 2000 provides grants to states as
inducements for them to establish curriculum and student performance standards, as well
as standards or strategies that ensure students will have an opportunity to learn the
content embodied in the state standards. Even prior to the federal effort, however, a
number of states were already using curriculum as a reform vehicle by relying on such
strategies as the development of curricular standards and frameworks, the redesign of
their assessment systems, and other means such as textbook adoption policies.


These federal and state initiatives have drawn on the prior standards-setting
efforts undertaken by professional organizations and have also prompted other
disciplines to begin similar exercises. The National Council of Teachers of Mathematics
(NCTM, 1989, 1991) was responsible for one of the earliest professional efforts to
improve classroom practice through the promulgation of curricular and teaching
standards. Its approach was later reflected in new state curriculum frameworks such as
those in California (California Department of Education, 1992). In mathematics,
curriculum reform has been characterized by learning goals that emphasize
understanding the conceptual basis of mathematics, reasoning mathematically and
applying that reasoning in everyday situations, offering alternative solutions to problems,
and communicating about mathematical concepts in meaningful and useful ways.
Consistent with those goals, curriculum reformers have advocated changes in both
mathematics content and instructional strategies. Particularly prominent in this reform
vision of the mathematics curriculum is a changed view of the teacher's role. Because
students are expected to play an active part in constructing and applying mathematical
ideas, teachers are to be facilitators of learning rather than imparters of information. In
terms of actual instructional activities, this shift means that rather than lecturing and
relying on a textbook, teachers are to select and structure mathematical tasks that allow
students to learn through discussion, group activities, and other modes of discovery.

Despite its growing popularity, the use of curriculum as a lever for educational
reform is not without its problems. Most of the attention thus far has focused on the
political difficulties inherent in defining what should be included in state curriculum
standards. The recent experience of states like California, Kentucky, and Pennsylvania,
where serious controversies have erupted over the content of state curriculum
frameworks and assessments, illustrates the passion that questions about what students
should be taught can engender (Merl, 1994; Harp, 1994; Ravitch, 1995). Debate has also
erupted over the use of OTL standards as part of a curricular reform strategy, with the
controversy focused on values such as how equity is defined or the appropriate role of
state vs. local government (National Council on Education Standards and Testing, 1992;
O'Day and Smith, 1993; Rothman, 1993; Owens, 1994; Goodling, 1994).


Equally important, however, are the technical feasibility issues that arise when 
curriculum is used as the focus of policy. One major problem stems from limitations on 
the amount and type of indicator data currently collected by the federal government and 
the states. Statistical data about the condition of schooling focused historically on inputs 
such as per pupil spending and on outcomes, most notably student test scores. 
Information about how schools are organized and how students are taught tended to be 
available only through research studies that were based on data collected from limited
samples on a non-routine basis. 

However, beginning in the mid-1980s, a number of researchers and policymakers
began to advocate expanding the type of indicator data that was routinely collected and
reported (Murnane and Raizen, 1988; Shavelson, et al., 1987; 1989; OERI State
Accountability Study Group, 1988; National Study Panel on Education Indicators, 1991;
Porter, 1991). They argued that indicator data on school and classroom processes were
necessary to monitor educational trends, compare schooling conditions across different
kinds of students in different geographic locations, and generate information that
could be used in holding schools accountable. A good part of the rationale for collecting
more than just input and outcome data lay in the fact that these indicators were to be
used for policy purposes. Knowing that educational conditions were getting better or
worse provided little insight into why particular trends existed or how to fix problems or
replicate successes. Furthermore, it had become clear that the way in which educational
inputs were used was as important as the absolute level of those resources. To
accommodate the information needs of policymakers, then, indicator systems had to
include data that could provide a comprehensive picture of the schooling process as it
occurred in schools and classrooms.

Consequently, proposed designs for new indicator systems advocated including
process measures such as teacher background and experience, school- and grade-level
organization, course offerings and student course-taking patterns, curriculum content,
instructional materials availability and usage, and instructional strategies. In
recommending that a broad array of school and classroom process measures be included
in indicator systems, researchers drew upon studies documenting the relationship
between student achievement and the type of instruction they receive (Shavelson, et al.,
1989).

Some indicator systems were expanded to include school process data. For
example, at the national level, NAEP and the longitudinal surveys of students sponsored
by the National Center for Education Statistics (NCES) (e.g., HS&B, NELS)
surveyed students, teachers, and school administrators about school organization and
resources, teacher qualifications, curricular content, and instructional strategies. These
data could be disaggregated by gender, ethnicity, urbanicity, and in some cases, by state.
In addition, in mathematics and science, 47 states were reporting data to the Council of
Chief State School Officers on teacher qualifications and student course-taking patterns
(Blank and Gruebel, 1993). By the late 1980s, in about half the states, data were
available about school-level performance (e.g., student test scores, attendance, and
drop-out rates) and some states like California issued "school report cards" which included
school process data such as the proportion of students taking college preparatory or
Advanced Placement courses (OERI State Accountability Study Group, 1988). Although
not typically reported by school, many states also collect information about teacher
qualifications that could be disaggregated to the school level. No one, however, is
collecting data on the curricular content and instructional strategies available to students
in different local jurisdictions. At this point, it is possible to describe how students'
curricular opportunities differ for boys vs. girls, for different ethnic groups, and for urban
students as compared with those in either rural or suburban areas. But we do not know
whether the curriculum experienced by students in Seattle is significantly different from
what students in Indianapolis or Pikeville, Kentucky experience, or whether curriculum
differs greatly among schools within the same state.

Besides these limits on the amount and type of curriculum indicator data, there
are substantial methodological problems with the available data. The most common
data, available on a school-by-school basis, are derived from reports by principals and
other administrators about course offerings and student enrollment in those courses.
However, the SIMS data suggest that because of significant variation in the breadth and
depth of topic coverage, knowing that most ninth graders take algebra does not provide
adequate information about their actual opportunity to learn algebra content.

Even the more comprehensive data about classroom processes, collected from
nationally-representative samples of teachers, are limited in their ability to portray a
valid picture of the schooling process. Most curriculum data are collected through
teacher surveys because these are cost-effective and impose only a modest time burden
on respondents. However, some aspects of curricular practice simply cannot be
measured without actually going into the classroom and observing the interactions
between a teacher and students. These include: discourse practices that evidence the
extent of student participation and their role in the learning process, the use of small
group work, and the relative emphasis placed on different topics within a lesson and the
coherence of teachers' presentations. Given the rudimentary status of curriculum data in
most national and state indicator systems, efforts to obtain an accurate picture of how
opportunities to learn vary for different groups of students will most likely continue to
focus at a more general level than these finely-grained aspects of instruction.

If policymakers and the public are interested in data about school curriculum that 
are both comparable across local jurisdictions and can be disaggregated to the school- 
level, teacher surveys will remain the most feasible way to collect such information for 
the foreseeable future. Yet, to this point, none of the national survey data collected 
from teachers has been validated to determine whether it measures what is actually 
occurring in classrooms. Despite major advances in the design of background and school 
process measures, studies have generally developed a few new items and then "borrowed" 
others from earlier studies. Little effort has been made to validate these measures by
comparing the information they generate with that obtained through alternative measures 
and data collection procedures. For example, are teachers' reports of curricular goals or 
content coverage consistent with the material tested and the types of questions asked on 
their exams? 
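
The kind of consistency check raised in this question can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the function name, the topic labels, and the coded responses are hypothetical, and a simple topic-by-topic agreement rate is used as a stand-in rather than as the procedure followed in this study.

```python
def compare_reports(survey, artifacts):
    """Compare survey-reported topic coverage with coverage seen in artifacts.

    Both arguments map a topic name to True (covered) or False (not covered).
    Only topics coded in both sources are compared.
    """
    topics = sorted(set(survey) & set(artifacts))
    if not topics:
        raise ValueError("no topics in common")
    agree = [t for t in topics if survey[t] == artifacts[t]]
    # Topics the teacher reports covering that never appear in the artifacts.
    reported_only = [t for t in topics if survey[t] and not artifacts[t]]
    # Topics that appear in the artifacts but were not reported on the survey.
    artifact_only = [t for t in topics if artifacts[t] and not survey[t]]
    return {
        "agreement_rate": len(agree) / len(topics),
        "reported_only": reported_only,
        "artifact_only": artifact_only,
    }


# Hypothetical coding for one teacher: survey answers vs. topics found on exams.
survey = {"linear equations": True, "quadratics": True,
          "proof": True, "probability": False}
artifacts = {"linear equations": True, "quadratics": True,
             "proof": False, "probability": False}

result = compare_reports(survey, artifacts)
print(result)
```

In this made-up case the two sources agree on three of four topics, and the one disagreement is a topic reported on the survey but absent from the exams, which is the direction of discrepancy that would be expected if responses were drifting toward reform rhetoric.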

Given the complexity of the teaching and learning process; the amount of
variation across classrooms, as evidenced from more in-depth, school-based research; and
shifting modes of instruction as new curricular reforms are introduced, it is reasonable to
assume that surveys alone may not adequately measure even the most generic forms of
instructional practice. Therefore, if national teacher surveys are to remain the major
source of information about the instruction American students are receiving and if policy
decisions continue to be made based on these data, mechanisms will need to be
established to validate the survey data. The benchmarking strategy outlined in this study,
which relies on other data such as textbooks and teacher assignments, is one
method for improving the quality of national curriculum indicator data.

Past research on the relationship between student achievement and the instruction
they receive, as well as the growing emphasis on curriculum as a policy lever, suggests
several factors that need to be considered in efforts to improve the quality of curriculum
indicator data, whether they be based solely on surveys or also include more in-depth
validation procedures. First, curriculum is a multi-dimensional concept that includes, but
is not limited to, the content of instruction. Consequently, in addition to content or topic
coverage, information also needs to be collected on several other dimensions. An
obvious one is teachers' instructional strategies. Key elements include the manner in
which content is sequenced and the mode in which teachers and textbooks present it to
students. For example, the effect on student learning might be quite different if a
teacher presents new content through a lecture than if she introduces students to the
same content by asking them to apply previously-learned concepts to a new situation and
has them do it while working in small groups. Another critical dimension of curriculum
is the goals that teachers pursue as they present course content to students and use
various instructional strategies. The relative emphasis that teachers give to different
objectives reveals something about their expectations for a particular course, and their
choice of objectives is likely to influence how they configure topics and instructional
activities within that course. However, teachers' reports of their course objectives reflect
intended behavior and are likely to be less reliable than reports of actual behavior, such
as topic coverage and instructional activities. For that reason, data on teacher goals can
be suggestive, but they need to be interpreted in tandem with other information about
classroom activities.

Second, curriculum indicators need to capture the variability inherent in a
complex activity such as teaching. We have noted that data on course enrollments alone
are insufficient because they convey little information about the actual content of the
course and even less about the instructional strategies used. Similarly, because of the
current flux in instructional policy and practice, data collection instruments need to
measure both traditional forms of instruction and the newer approaches advocated by
curriculum reformers. Strategies such as having students work in small groups to find
joint solutions or use manipulatives to demonstrate a concept are currently much in
vogue among reformers. Yet decades of research on educational change, and most
recently on the implementation of curriculum reforms (e.g., Cohen and Peterson, 1990),
suggest that many teachers will continue to use more traditional approaches such as
lecturing to their students and having them work exercises from a textbook. Therefore,
data collection instruments need to be broadly-focused and sensitive enough to reflect
the diversity of classroom practice during a transitional period in school curriculum.

In the next chapter, we outline our study methods and indicate how we attempted 
to take past research and the current context into consideration in designing this study. 




Chapter 2 
STUDY METHODS 
GENERAL APPROACH AND RATIONALE 

The benchmarking procedures developed in this study were designed as one way 
to validate survey data collected from classroom teachers. However, a number of 
different approaches could be used to ensure that survey items accurately measure what
is happening in classrooms. In choosing among possible strategies, two criteria need to 
be considered. Any validation strategy should measure curricular goals, content, and 
instructional activities as sensitively as possible, but it must also do so cost-efficiently 
without imposing a significant burden on teachers and students. 

The methodology of teaching and learning research would suggest that detailed 
classroom observations are the best way to make inferences about the curriculum 
students are actually receiving (for a detailed discussion of various approaches to
teaching and learning research, see Wittrock, 1986). However, from a national indicators
perspective, this approach is problematic.

Although classroom observation is an effective method for capturing curricular
depth, it is considerably less efficient in measuring breadth, a requirement for indicator
purposes. For example, if one's purpose were to focus intently on a narrow slice of
curriculum (e.g., the teaching of the Pythagorean Theorem) taught at a prescribed point
in most classrooms of a given course, then one could target a specific amount of
observation time to capture the teaching of that topic, and comparing survey responses
with observational data would presumably be straightforward. But for most purposes,
the span of curriculum to be measured through indicator data is much more extensive,
and the sequencing of topics and time allocations vary considerably from section to
section of even the same course, much less across courses. It may well be that
instruction on certain topics cycles throughout a course, making the targeting of
observation even more impractical. Choosing a fixed time of the school year to conduct
observations and capture whatever topics might be taught at that time runs the risk of
misspecifying where within a specific teacher's curriculum the observed topic falls, and
missing what was covered previously and planned for later. Consequently, the only kinds
of survey questions that could be validated in this way would be general ones dealing
with activities and process, as distinct from content.

Such limitations led us to conclude that on cost and feasibility grounds alone, 
classroom observation was not a viable tool for obtaining ongoing benchmark data. 
While it is an appropriate and necessary strategy for basic research on school curriculum, 
classroom observation is not practical for education indicator purposes. Moreover, 
unless observations were long-term and extensive, they could very well distort decisions 
about the validity of specific survey alternatives.

Consequently, we decided to build on prior research (McDonnell et al., 1990) and
to make the collection and analysis of a representative sample of teacher assignments
(homework, quizzes, classroom exercises, projects, examinations), gathered throughout
the semester, the centerpiece of the benchmarking effort. We believe that these
examples of classwork and how the teacher uses them represent much of the curriculum
as experienced by students. Thus these systematic artifacts of learning, placed within the
context of syllabi and textbook coverage, constitute a solid basis for characterizing the
implemented curriculum presented to students. In addition, by spreading data
collection over a broader period of time, at a much lower cost than equivalent
observational activities, the span of curriculum that can be measured is expanded
considerably.

The approach taken in this study, then, was to use these instructional artifacts as
deeper probes about the nature of instruction in a small number of sites. The artifacts 
were coded to extract data about teachers' instructional content, activities, and goals. 
That information was then compared with their responses on surveys similar to those 



However, these artifacts do not provide information about how students receive and respond to the
[...] assignment provided, to include two samples of student work graded as an A, two B/Cs, and
two examples of below [...] work. [...] arrangements were made to have the student work copied
for teachers), and most of the non-response rate for the study was accounted for by this request.
Therefore, we did not request student work from the remainder of the teachers in the sample.
[...] burdensome and intrusive as more schools [...] storage of [...]




administered as part of national data collection efforts. The overarching question was
whether measures of goals, activities, and content from the survey cohered or were
correlated with similar measures obtained from the benchmark data. To the extent that
inconsistencies emerged, we needed to analyze why and to identify ways to improve
coherence in future indicator data. The results of this endeavor are threefold: an
analysis of how well survey data measure curriculum, as compared with data that are
closer to the actual instructional process; a recommended set of procedures for
periodically validating data collected from large-scale surveys; and suggested
enhancements in the type and number of items included on these surveys.
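
As a hedged illustration of what "cohered or were correlated" could look like in practice, the sketch below computes a Pearson correlation between a survey measure and the corresponding benchmark measure. The function name, the choice of Pearson's coefficient, and the sample values are assumptions made for this illustration; the report does not name a specific statistic at this point.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one measure is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data for five topics: periods a teacher reported on the survey
# versus lessons on the same topics found when coding the benchmark artifacts.
survey_periods = [0, 3, 8, 12, 20]
benchmark_lessons = [0, 2, 5, 9, 14]

r = pearson(survey_periods, benchmark_lessons)
print(f"survey vs. benchmark correlation: {r:.2f}")
```

A high coefficient would indicate only that the two measures rank topics similarly; it would not by itself show that the survey captures the same level of coverage as the artifacts.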
STUDY SAMPLE 

The study is based on data collected from a sample of 70 teachers who comprise
the majority of the mathematics faculty in nine secondary schools located in California
and Washington. The characteristics of the schools and the teachers are summarized in
Table 2.1. Although the in-depth and exploratory nature of the data collection meant
that only a small sample of teachers could be studied, we wanted to make certain that
they were typical of those who participate in large, national surveys. Therefore, schools
were selected from among those that were part of the 1992 NELS Second Follow-up
Study (NELS-SFU). Twenty-four schools were contacted and nine agreed to participate.



The artifacts were coded by six experienced mathematics teachers and two project staff, using a coding
instrument that paralleled the items on the survey. The coding process is described in a subsequent section.

In addition to the mathematics teachers from whom data were collected, 18 science teachers from seven
of the sampled schools also participated in the study as part of an exploratory analysis focused on developing
curriculum indicators for high school science courses. However, this report is based on only the data collected
from the mathematics teachers.

Each participating teacher was paid an honorarium of $175 to complete two surveys and provide
instructional artifacts over the course of a semester. The 13 teachers who participated in follow-up interviews
were paid an additional $50.

Because NELS was designed to obtain data on a nationally representative sample of students, teachers were
included only if they taught students in that sample. Therefore, the 2606 mathematics teachers who were
surveyed in NELS-SFU do not constitute a nationally representative sample of high school mathematics teachers.
However, just to show how our much smaller sample compares with a larger one drawn from across the country,
we compared our teachers with the NELS-SFU sample and found that the mean years of teaching experience
is exactly the same for the two groups. Our sample has a slightly higher proportion of males (58 percent as
compared with 52 percent for NELS-SFU), but the major difference between the two groups is that our sample



Table 2.1
STUDY SAMPLE

School Characteristics (N=9)                                        Number

  California
    Urban                                                              4
    Suburban                                                           1
    Rural                                                              1
  Washington
    Urban                                                              1
    Suburban                                                           2

Mathematics classes in each of the course categories examined:

    Below Algebra I                                                   20
    Algebra I                                                         15
    Geometry                                                          12
    Algebra II/Trigonometry                                            8
    Math Analysis/Pre-Calculus                                         7
    Calculus                                                           8

Teacher Characteristics (N=70)*                                     Percent

    Male                                                              58
    Female                                                            42
    College major in mathematics                                      47

    Mean years of teaching experience                        17 (S.D.=9)

*74 teachers agreed to participate in the study, but four dropped out before the artifact data
collection was completed.




Of the remaining schools, nine refused to participate. The others agreed to participate, but they
were eliminated for various reasons such as the small size of the mathematics faculty in
several schools and year-round schedules that did not coincide with our data collection
timetable.

Of the nine schools, five are located in urban areas, three are suburban, and one
is rural. The largest school enrolls 2800 students, and five schools have enrollments in
excess of 2000. The smallest school enrolls 980 students. The enrollment in five of the
schools is 65 percent or more Anglo; the other four have minority enrollments of
65 percent or more.
DATA 

Table 2.2 summarizes the types of data collected and the purpose each data
source served in the study design. All the data are discussed at greater length in this 
section. 

Teacher Surveys

Three factors shaped the design and administration of the survey component of 
the study. First, because the purpose of the project was to validate data collected as part 
of efforts such as NELS, the survey instrument needed to approximate closely the type 
administered in national surveys. Second, we had to make certain that the collection of 
artifact data did not bias teachers' survey responses by sensitizing them to the kinds of
questions that would later be asked of them on the survey. Finally, we wanted to pilot
the administration of a more extensive survey than has typically been used in national 
indicator data collection. 



includes a considerably lower proportion of teachers with a college major in mathematics (47 percent as 
compared with 70 percent in the NELS-SFU sample). 

Our concerns about artifact data collection contaminating survey responses were two-fold. The first
was that if teachers were completing daily logs and providing assignments throughout the semester, they might
become more aware of the types and frequency of their classroom activities than they would ordinarily be.
Consequently, their survey responses would be more accurate than would be the case in routine data collection
when teachers only complete a survey. If that were the case, the survey responses in our study would not be
equivalent to those collected in national indicator efforts. Second, we were concerned that because of their direct
contact with members of the research team throughout the semester, teachers might be more likely to give what
they considered to be socially desirable responses. In this case, those responses were likely to be consistent with
the rhetoric of the mathematics reform movement and away from more traditional teaching strategies.

[Table 2.2, summarizing the types of data collected and the purpose each data source served,
is not legible in the source copy.]

Our strategy for taking these factors into consideration was to administer a survey
prior to collecting the artifact data. It included the same items as those in the
instructional activities, content, goals, and teacher background sections of the teacher
questionnaire administered as part of the 1992 NELS-SFU. Teachers were asked to
respond in terms of one particular section of a single course that they were teaching.
We then collected artifact data on that same section over one semester. At the end of
the semester, we administered a second survey that repeated the same instructional
activities and content items asked on the first survey, but also included an expanded list
of topics, goals, and instructional activities. The NELS-SFU survey contained 11 topics
to measure content coverage, 16 items on instructional strategies, and ten on goals and
objectives. In contrast, the survey administered after the artifact data collection included
30 topics to measure content coverage (with separate topic lists for courses at or above
Algebra II and another for courses below that level), 33 items on instructional strategies,
and 32 on goals and objectives. The enhanced survey also included items designed to
measure teachers' expectations about levels of student understanding and how teachers
conceive of their role in student learning. Appendix A contains copies of both
questionnaires.

In addition to expanding the post-data collection survey to probe in greater depth
and to measure curriculum in more diverse ways, we also experimented with a variety of
different item formats and response options. For example, the NELS survey asks
teachers whether a topic was taught previously, reviewed only, taught as new content, will
be taught or reviewed later in the year, or whether the topic is beyond the scope of the
course or not included in the curriculum. In addition to this response option, the
enhanced survey also asked about the number of periods spent on a topic, using a
response option that included six categories ranging from 0 periods to > 20 periods. In
some questions, teachers were asked to describe characteristics of their instructional
activities in terms of the percentage of class time or of an assignment; responses were
elicited in some cases as a continuous variable and in others, as a categorical variable.
In other questions, frequency was defined as a categorical measure ranging from almost
every day to never. Similarly, teachers were asked about the amount of emphasis they give
to different goals, but they were also asked more indirectly about curricular goals in a
question that probed their expectations for students' level of understanding. Including a
variety of different types of response options provided us with another source of
information from which to make recommendations about how to improve existing
surveys.

Analysis of the two surveys suggests that teachers' responses were not biased by
the artifact data collection, and that validation procedures can be designed to occur after
survey data have been collected. When we compared teachers' responses to the two
surveys, we found few significant differences between their responses on items that
appeared on both the pre- and post-survey. On average across all items common to both
surveys, 90 percent of responses differed by no more than one response option and 60
percent were exactly the same on the two surveys. Those items where a large proportion
of responses changed were ones that would be expected to change between the beginning
and end of the semester because teachers have more precise information at the end
(e.g., the percent of class time spent administering tests or quizzes, the frequency of
teacher-led discussions). In addition, there was no evidence that teachers gave socially
desirable responses, or felt it necessary to present an image of their teaching consistent
with the rhetoric of the mathematics reform movement. As the discussion in the subsequent
chapters will indicate, a large proportion of teachers reported engaging in traditional
activities such as lecturing and correcting or reviewing homework on a daily basis, and
most reported engaging in reform-oriented activities such as student-led discussions
rarely or not at all.
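
The pre/post comparison described above can be sketched as follows. This is a minimal illustration only: the function name, the 1-to-5 coding of response options, and the ten sample responses are assumptions made for the example and are not the study's data.

```python
def agreement_rates(pre, post, tolerance=1):
    """Return (exact, within_tolerance) agreement as fractions of paired responses.

    `pre` and `post` hold the same teachers' responses to one survey item,
    coded as integers so that adjacent response options differ by 1.
    """
    if len(pre) != len(post) or not pre:
        raise ValueError("pre and post must be non-empty and the same length")
    diffs = [abs(a - b) for a, b in zip(pre, post)]
    exact = sum(d == 0 for d in diffs) / len(diffs)
    within = sum(d <= tolerance for d in diffs) / len(diffs)
    return exact, within


# Hypothetical responses from ten teachers to one frequency item,
# coded 1 (never) through 5 (almost every day).
pre_responses = [5, 4, 4, 2, 1, 3, 5, 2, 3, 4]
post_responses = [5, 4, 3, 2, 1, 5, 5, 3, 3, 3]

exact, within_one = agreement_rates(pre_responses, post_responses)
print(f"exact: {exact:.0%}, within one option: {within_one:.0%}")
```

Averaging these two rates over every item common to both surveys would give summary figures of the kind reported in the text. The same function, with a tolerance of 2, could also be applied to pairs of coder judgments.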
Instructional Artifacts

Course textbooks. A copy of the textbook used by each teacher in the study
sample was purchased, and teachers were asked on the post-data collection survey which
chapters they had covered over the course of the semester and which additional ones
they had already covered or planned to cover during the rest of the year. All the chapters or
lessons a teacher reported as covering were then coded to determine which topics were
covered. That information became one of the benchmarks against which topic coverage,
as reported by teachers on the survey, was compared.
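
A minimal sketch of the textbook benchmark just described is given below. The data layout (one chapter-and-topic pair per coded lesson), the function names, and the sample values are hypothetical and serve only to illustrate the logic of restricting the coding to reported chapters and then comparing it with the survey.

```python
from collections import Counter


def lessons_per_topic(coded_lessons, covered_chapters):
    """Count coded lessons per topic, using only chapters the teacher reported covering."""
    counts = Counter()
    for chapter, topic in coded_lessons:
        if chapter in covered_chapters:
            counts[topic] += 1
    return counts


def compare_coverage(survey_taught, benchmark_counts):
    """Pair each surveyed topic with (reported as taught?, textbook lessons found)."""
    return {
        topic: (taught, benchmark_counts.get(topic, 0))
        for topic, taught in survey_taught.items()
    }


# Hypothetical coding of one textbook: (chapter number, topic) for each lesson.
coded_lessons = [
    (1, "square roots"),
    (1, "square roots"),
    (2, "slope"),
    (3, "quadratic equations"),
    (4, "slope"),
]
covered_chapters = {1, 2, 3}  # chapters the teacher reported covering

# Hypothetical survey responses: did the teacher report teaching the topic?
survey_taught = {
    "square roots": True,
    "slope": True,
    "quadratic equations": False,
    "complex numbers": False,
}

benchmark = lessons_per_topic(coded_lessons, covered_chapters)
comparison = compare_coverage(survey_taught, benchmark)

# Topics where the survey and the textbook benchmark disagree about coverage.
mismatches = [
    topic for topic, (taught, lessons) in comparison.items() if taught != (lessons > 0)
]
print(mismatches)
```

In this sketch the chapter 4 lesson is ignored because the teacher did not report covering that chapter, which mirrors the reliance on teachers' own chapter reports described in the text.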

Teacher daily logs. During the same five weeks that all their daily assignments
were collected, teachers were also asked to complete a one-page log form (included in
Appendix A) at the end of each day. The form asked them to list which topics they
covered during that day's class period and to indicate on a checklist all the modes of
instruction they used and the activities in which the students engaged. There was also a
comments section where teachers were asked to provide any information about the lesson
that they felt was important (e.g., that class time was reduced by other school activities,
that something particularly different or innovative occurred that day). In order to
minimize teacher burden, the log form was designed to be completed in approximately
five minutes.

Because the logs were completed by teachers, they do not represent an external 
source for validating the surveys in the same way that textbooks and assignments do. 
However, they do provide a check on the reliability of the surveys since they provide
greater detail about classroom activities, with the information collected closer in time to 
the actual events. 

Assignments. Teachers were asked to provide copies of every assignment they 
gave to students for a period of five weeks. The five weeks of data collection were
divided into one week at the beginning of the semester, three consecutive weeks in the
middle, and one week at the end. During these times, teachers provided all in-class and
homework assignments, quizzes, exams, major projects, and any other written work 
assigned to students. In addition, teachers completed a pre-printed label, checking the 



[...] Two teach interactive mathematics, which is an alternative method [...] algebra and
geometry that combines the two subjects and integrates individual topics [...] a
problem-solving focus. The other two teach Math A-B, which is a course offered [...] California
[...] Consequently, this data source [...]




major purpose of an assignment, its relationship to other classwork, whether the work
was done individually or in groups, and whether done inside or outside the classroom.
This label was affixed to each assignment. During the remaining weeks of the semester,
teachers provided copies of their major assignments only, i.e., exams, papers of more
than three pages, and projects. A pre-printed label was also attached to each of these
assignments. On average, 20 assignments, including major assignments and projects,
were provided by each participating teacher (n=1407). Exams and quizzes averaged
about five per teacher (n=368).
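
The five-week collection schedule described above (one week at the start, three consecutive weeks in the middle, one week at the end) could be laid out as in the sketch below. The 18-week semester length, the centering of the middle block, and the function name are assumptions made for illustration; the report does not state which calendar weeks were used.

```python
def collection_weeks(semester_weeks, middle_block=3):
    """Return 1-based week numbers for full artifact collection.

    The schedule is the first week, a run of consecutive weeks placed
    roughly in the middle of the semester, and the last week.
    """
    if semester_weeks < middle_block + 4:
        raise ValueError("semester too short to separate the three collection periods")
    start = (semester_weeks - middle_block) // 2 + 1
    middle = list(range(start, start + middle_block))
    return [1] + middle + [semester_weeks]


# Assumed 18-week semester (hypothetical; not stated in the report).
weeks = collection_weeks(18)
print(weeks)
```

In the remaining weeks, only major assignments (exams, longer papers, and projects) would be collected, as the text describes.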
Interviews 

In each school, we conducted face-to-face interviews with the principal, the head
counselor, and the mathematics department chair. These interviews averaged about 45
minutes, and focused on the type of students attending the school, the different levels of
courses offered, what criteria the school used in assigning students to different
mathematics courses and sections, and how decisions about teacher assignments were
made. We also asked the department chairs to describe in some detail the major
differences among the mathematics courses offered by the school in terms of level of
difficulty, types of students enrolled, topics covered, instructional materials and strategies,
course requirements, and grading practices. These interviews helped us place the survey
and artifact data in a richer and more valid context. We were particularly interested in
finding out whether there were any recent school- or department-level initiatives that
might be shaping the curricular content or instructional approaches used by teachers.

We had not planned to conduct any follow-up interviews with teachers after the
artifact data were collected. However, we were having difficulty interpreting several key
findings that showed a lack of internal consistency between what teachers reported on
the survey as their goals and what they reported about instructional activities. We found,
for example, that a substantial proportion (40 percent) of teachers were reporting a
major or moderate emphasis on most of the goals consistent with the mathematics reform
movement. However, only a small proportion (12 percent) reported engaging regularly
in most of the instructional activities advocated by NCTM. Similarly, the mean level of
agreement between teachers' self-reports about their goals on the surveys and the coding
of their exams was low. The typical pattern was for teachers to report a minor or
moderate emphasis on most goals, while coders judged teachers' exams as showing no
emphasis on those goals. The discrepancy was greatest on the so-called "reform" goals
and considerably less on more traditional goals (e.g., performing calculations with speed
and accuracy).

Before we concluded that these discrepancies represented "real" differences
between teachers' reported and actual behaviors, we wanted to make certain that they
were not the result of teachers interpreting survey items in a fundamentally different
way from the coders, who were using the reform movement's definitions. As a result, we
decided to address these questions through the use of follow-up group interviews. We
interviewed all the original study participants from two high
schools in several group discussions that lasted about 90 minutes each. We asked
teachers questions that would help us clarify our anomalous results. For example, with
regard to the instructional goals that seemed to have been interpreted inconsistently, we
asked: "In the course you reported on in your survey, what types of instructional activities
do you see as representing this particular goal?" We report the results of these group
interviews in subsequent chapters as one basis for interpreting some of our findings.

The study data were collected in four waves. We initially collected data from
teachers in two schools in the spring of 1992, as a pilot for the rest of the study. We
found no substantive problems with our data collection instruments and procedures, but
we needed to streamline them to reduce teacher burden. Consequently, the request for
graded student work was eliminated and the enhanced teacher survey was shortened.
We also wanted to make certain that there would be no significant differences between
collecting data in the fall, as compared with the spring semester. Consequently, we
collected data from three additional schools in fall 1992 and from the remaining four in
spring 1993. The follow-up group interviews were conducted in March 1994.
CODING THE ARTIFACT DATA 

The effectiveness of a validation strategy, based on instructional artifacts, rests
entirely on how information is coded or extracted from those artifacts. Valid and
reliable coding requires that three criteria be met. First, in order to make comparisons
between the survey and the artifacts, the coding format needs to parallel the survey items
as closely as possible. However, valid comparisons depend on more than just a similar
format for the two types of data. The survey items and the coding categories should be
so clearly defined that teachers and coders will interpret them similarly. Second, the
coding should extract as much information as possible from the artifacts so as to provide
a full, valid description of a teacher's instruction, but it needs to do so without requiring
judgments or inferences that go beyond the data. Third, the data need to be coded
reliably, i.e., another coder would make similar judgments about the same information.

Several factors work against meeting these criteria, however. First, artifact data
are unstandardized in the sense that the type and mix of assignments can vary
considerably across teachers. Even the textbooks in our sample, the most standardized
type of artifact, varied from the conventional (e.g., Dolciani's Algebra I text) to the
innovative (e.g., Sunburst Geometry, Merrill's integrated math series) to the controversial
(the Saxon series). Second, while some dimensions of curriculum have commonly
understood meanings, others do not. For example, most mathematics teachers would
agree on what content falls within the categories of square roots, quadratic equations, or
slope. But topics such as math modeling or proportional reasoning may be interpreted
quite differently by different teachers. As we found in our analysis, the problem is
particularly acute for curricular terms associated with the mathematics reform movement.

Third, coding a given teacher's artifacts requires a large number of judgments,
some of which may require inferences that go beyond the available data. Although
textbooks only need to be coded for topic coverage, other artifacts have to be coded to
extract information on topics, instructional characteristics of the exam or assignment,
level of understanding required of students, and teachers' instructional goals. Depending
on the degree of aggregation desired, coding judgments can be made across all artifacts
of a given type (e.g., across all assignments); with each separate exam or assignment as
the unit of analysis; or at the most disaggregated level, on an item-by-item basis within a
given assignment or exam. In addition to the sheer number of judgments, coding
artifacts also requires a variety of different kinds of judgments. In some cases, it only
involves matching a textbook lesson, assignment, or exam item to one of the topics on
the survey list. But other coding tasks require more complex judgments (e.g., identifying
types of exam and assignment formats, making inferences about the purpose of an
assignment or about a teacher's instructional goals). The number and variety of
judgments involved in coding a teacher's artifacts provide considerable detail about the
nature of his or her instruction, and expand the number of benchmarks available to
validate the survey results. The downside is that the greater the number and the variety
of judgments that coders have to make, the more difficult it is to ensure an adequate
level of reliability.

We addressed these constraints on valid and reliable coding by using six 
experienced, secondary mathematics teachers as coders. Project staff trained them for 
two days and the coders were then supervised by two project staff who are also 
experienced mathematics teachers. During the two days, coders familiarized themselves 
with the coding manual and sample sets of artifacts. They also did practice coding, 
followed by an extended discussion and refinement of the coding rules. The first artifact 
file took each coder about one day (approximately 7 hours) to complete, but the amount 
of time was reduced to about 2-4 hours per file once coders became more experienced. 

About 15 percent (n=11) of the artifact files were double-coded by project staff
for reliability purposes. The rate of consistency between coders varied somewhat across 
the types of artifacts. For textbooks, coders had a rate of agreement of 58 percent on 
the exact number of lessons that included a particular topic, 74 percent of their 
judgments about topic inclusion differed by only one lesson, and 85 percent were within 



In our coding, we chose an approach that falls somewhere in the middle of these three options. Topic
coverage, level of understanding, and assignment characteristics were coded for assignments (homework, in-class
exercises, quizzes) at the level of the individual assignment. However, coders were asked to make summary
judgments about teachers' goals as they were evidenced across all their assignments (i.e., one judgment based
on their approximately 20 assignments). For exams, the coding was done at a finer level of detail with level of
understanding coded for each individual item or question on an exam, and instructional goals for each separate
exam. Closer attention was paid to exams because we felt that while both assignments and exams represent the
enacted curriculum, exams communicate what teachers consider to be most important.




two lessons. On assignments, the coders had a rate of agreement of 74 percent on all
their judgments about topic inclusion, instructional characteristics, and goals; they
differed by only one category for 86 percent of the judgments they made, and were
within two categories on 91 percent. The rates for exams were 71, 78, and 81 percent
respectively for the three levels of agreement.

Although these rates of agreement are reasonable, given the nature of the task,
two caveats are in order. First, these aggregate rates of agreement mask the large
number of judgments that coders had to make. For example, for each assignment,
coders were making 30 different judgments, a number then multiplied across the
approximately 20 assignments each teacher provided. For exams, the number of separate
judgments was 51, multiplied by the 5 or so exams from each teacher. A second caveat
points to what became an important factor in interpreting some of our substantive
results, viz., that the aggregated rates mask considerable variation across types of
judgments. On some items, the rates of agreement between coders were close to 100
percent and in other instances, they fell below 50 percent. The items with the highest
rates of agreement tended to be the more specific, narrower content topics (e.g., complex
numbers) and traditional instructional approaches and goals (e.g., proportion of exam
items that are multiple choice, proportion that are minor variations of homework
problems). Those with the least agreement were either broad topic categories or more
reform-oriented topics and approaches (e.g., patterns and functions, problems [having]
more than one possible answer). Although our coders were experienced teachers,
conversant with the NCTM standards, and trained in a common set of decision rules,
their lack of agreement evidenced some of the same confusion about terms that was
reflected in teachers' responses. As a result, these coding problems helped inform our
substantive findings and recommendations for improving future data collection.
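
The two caveats can be made concrete with a short sketch. The per-teacher judgment counts use the approximate figures given in the text (30 judgments on each of about 20 assignments, 51 on each of about 5 exams). The double-coded records, the 0/1 coding, and the function name are hypothetical and only illustrate how an overall agreement rate can hide large differences between items.

```python
from collections import defaultdict


def agreement_by_item(double_coded):
    """Return {item: exact agreement rate} from (item, coder_a, coder_b) records."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for item, code_a, code_b in double_coded:
        totals[item] += 1
        hits[item] += int(code_a == code_b)
    return {item: hits[item] / totals[item] for item in totals}


# Approximate number of coding judgments per teacher, from the figures in the text.
assignment_judgments = 30 * 20
exam_judgments = 51 * 5

# Hypothetical double-coded records: 1 = topic judged present, 0 = absent.
records = [
    ("complex numbers", 1, 1),
    ("complex numbers", 0, 0),
    ("complex numbers", 1, 1),
    ("complex numbers", 0, 0),
    ("patterns and functions", 1, 0),
    ("patterns and functions", 0, 1),
    ("patterns and functions", 1, 1),
    ("patterns and functions", 0, 1),
]

by_item = agreement_by_item(records)
overall = sum(a == b for _, a, b in records) / len(records)

print(f"judgments per teacher: {assignment_judgments} (assignments), {exam_judgments} (exams)")
print(f"overall agreement: {overall:.1%}")
for item, rate in by_item.items():
    print(f"  {item}: {rate:.0%}")
```

Reporting agreement item by item, rather than only in aggregate, is one way to flag the terms on which coders (and possibly teachers) do not share a common definition.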

In the next three chapters, we summarize major findings, focusing first on
instructional content, then instructional strategies and finally, on instructional goals. In
each chapter, we provide examples of the kinds of information about curriculum that can
be obtained from teacher surveys. We then examine the level of consistency between
survey responses and the artifacts, identify reasons for discrepancies, and suggest how
they might be reduced in future indicator efforts.




Chapter 3 
INSTRUCTIONAL CONTENT 

Instructional content, or the topics covered in a particular course, forms the core of
the implemented curriculum. Although it is mediated through the instructional strategies
that teachers use, content is the dimension of curriculum whose relationship to student
achievement is the most well-established. It is also the aspect of curriculum that has
proven the least problematic to measure through teacher surveys. National surveys such
as NAEP and NELS have typically asked teachers whether they taught or reviewed any
of the items on a general list of topics. By asking teachers whether their students had
been taught the content reflected in specific test items, SIMS researchers expanded the
type of survey questions used to probe topic coverage in order to measure more precisely
students' opportunities to learn (OTL). The SIMS experience, in particular, suggests that
valid data on instructional content can be obtained from teacher surveys. That research
found that mean teacher OTL ratings provided a reasonably good predictor of
between-system achievement differences and consequently, had some predictive validity at the
level of national education systems (Travers and Westbury, 1989).

However, despite the success of the SIMS strategy in documenting topic coverage, 
several questions remain about the reliability and validity of content data obtained from 
national surveys. First, most U.S. surveys ask about topics at a level of generality that 
either does not differentiate the breadth or depth at which topics cutting across multiple 
courses are covered (e.g., polynomials, properties of geometric figures) or probes at the 
level of a single course title (e.g., trigonometry, calculus) and does not give any indication
of the specific content of that course. Second, surveys typically do not ask about the
amount of time spent on a particular topic--i.e., the number of periods or lessons
devoted to the topic. Finally, it is difficult to validate topic coverage in a cost-effective
way for indicator purposes. Unless all of a teacher's exams and assignments are
collected for an entire school year, these sources cannot provide an accurate picture of 
the topics covered or the depth of coverage. Textbooks are the obvious alternative 
because they typically span an entire course and can be collected and coded without 
burdening teachers. However, given earlier research on elementary mathematics
showing that teachers using the same text vary widely in their topic coverage and pacing
(Freeman et al., 1983) and the fact that teachers do not typically cover an entire 
textbook and may supplement it with other materials, textbooks can only be used as a 
source of validation if information is also available about how they are used by individual 
teachers. 

We tried to address each of these issues in designing our strategy for validating 
survey data about instructional content. As noted in the previous chapter, our survey
contained an expanded list of topics that, in addition to the more general topics included
on the NAEP and NELS surveys, included ones at a greater level of specificity. Our
survey also asked teachers about the number of periods devoted to each topic. Because
of the need to validate topic coverage information that spanned an entire year, we did
rely on teachers' textbooks as the primary source for validation. However, we asked
them exactly which chapters they covered and how closely they followed the textbook.
Only those chapters that teachers indicated they had already covered or planned to cover
by the end of the year were coded for content coverage. In addition, although we could 
not use either teachers' exams (because they covered only one semester) or their 
assignments and logs (which covered only five weeks) as a primary source for validating 
topic coverage, we did use them as a secondary source. 

Our analysis suggests that there are differences across topics in the accuracy with 
which their coverage is reported on teacher surveys. Those topics covered in upper-level 
courses tend to be reported with great accuracy, while the topics reported with less 
accuracy tend to be those covered in lower-level courses, more general topics, those 
associated with the mathematics reform movement, and ones that are used as tools in 
the learning and application of other topics (e.g., graphing, tables and charts). Before 
presenting the findings from our validation analysis, we provide some examples of the 
kinds of information that are available from survey data on topic coverage. 
DESCRIBING COURSE CONTENT FROM SURVEYS: ILLUSTRATIVE EXAMPLES 

Perhaps the most important use for topic coverage data is in describing the 
distribution of students' opportunities to learn the content associated with a particular 
course. A number of studies (e.g., McDonnell et al., 1990), including presentations of the
SIMS data (Kifer, 1993), have used "box and whiskers" plots to illustrate how topic
coverage for a particular course is distributed. We present similar data here, and then 
elaborate by moving beyond the standard of whether or not a set of topics has been 
taught as new content to showing how the amount of class time spent on core topics can 
vary. 

Table 3.1 categorizes those topics from the survey that are commonly covered at
four different course levels. These four sets of topics are not meant to be exhaustive,
but they do represent at least part of the core content for each of the courses listed.
Figure 3.1 compares the distribution of the pre-algebra and algebra topics taught as new
content in courses below algebra I with that taught in algebra courses. Figure 3.2 makes
the same comparisons, but uses as a criterion whether the two sets of topics were taught
for six or more periods--i.e., covered in some depth. The line across the middle of each
"box" represents the median; the lower and upper boundaries of the box equal the
twenty-fifth and seventy-fifth percentiles; the "whiskers" depict the tenth and ninetieth
percentiles; and the dots represent outliers beyond the tenth and ninetieth percentiles.
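
The five summary values behind each box and its whiskers can be computed directly from class-level coverage percentages. The following is a minimal sketch, not the procedure used in the study: the function names, the linear-interpolation rule for percentiles, and the class data are all assumptions made for illustration.

```python
def percentile(values, p):
    """Return the p-th percentile (0-100) of values using linear interpolation.

    The interpolation rule is an assumption; the report does not say which
    percentile definition was used.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def boxplot_summary(coverage):
    """Summarize coverage percentages with the conventions described above:
    box = 25th to 75th percentile, whiskers = 10th and 90th percentiles,
    outliers = values beyond the whiskers."""
    summary = {
        "whisker_low": percentile(coverage, 10),
        "box_low": percentile(coverage, 25),
        "median": percentile(coverage, 50),
        "box_high": percentile(coverage, 75),
        "whisker_high": percentile(coverage, 90),
    }
    summary["outliers"] = sorted(
        value for value in coverage
        if value < summary["whisker_low"] or value > summary["whisker_high"]
    )
    return summary


# Hypothetical data: percent of core topics covered in each of 11 classes.
classes = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
summary = boxplot_summary(classes)
```

With these invented values, the two classes at 0 and 100 percent fall outside the whiskers and would be drawn as dots.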

In terms of exposure to core algebra topics, there is little variation in OTL across
the algebra I classes in our sample. Even in those classes in the lowest quartile, teachers
report that seventy-five percent of the algebra topics are covered and most classes cover
80 percent or more of the core topics. Yet there also seems to be a fair amount of
attention in these algebra classes to lower level content, with the typical class also
covering 60 percent of the pre-algebra and arithmetic topics as new content. When we
examine the distribution of more indepth coverage in Figure 3.2, we see considerably
greater variation in the proportion of algebra classes in which core topics are covered for
longer periods of time. In a typical class, about half of the topics are taught over six or
more periods, but a few classes receive almost no indepth coverage of algebra topics
while a few at the other end of the distribution spend extended time on most core topics. 
The boxplot showing the distribution of indepth coverage of pre-algebra and arithmetic 
topics in the algebra classes indicates that while these classes may be covering a 
significant proportion of the lower-level topics, they are doing so for relatively brief 
periods of time. The typical algebra class only covers 20 percent of the pre-algebra
topics for more than six periods.

Table 3.1

REPRESENTATIVE TOPICS COVERED AT FOUR COURSE LEVELS

Pre-algebra and Arithmetic
    Ratios, proportions, and percents
    Conversions among fractions, decimals and percents
    Laws of exponents
    Square roots
    Applications of measurement formulas (e.g., area, volume)

Algebra I
    Polynomials
    Linear equations
    Slope
    Writing equations for lines
    Inequalities
    Coordinate geometry
    Distance, rate, time problems
    Quadratic equations

Algebra II
    Polynomials
    Quadratic equations
    Logarithms
    Conic sections
    Slope
    Sequences
    Matrices and matrix operations

Math Analysis/Pre-Calculus
    Trigonometry
    Polar coordinates
    Complex numbers
    Vectors
    Limits

[Figure 3.1 - Proportion of pre-algebra and algebra I topics taught as new content, by
course level. Boxplot panels: Pre-Algebra and Arithmetic Topics; Algebra Topics. Each
panel compares below algebra I classes with algebra I classes.]

[Figure 3.2 - Proportion of pre-algebra and algebra topics taught as new content for six
or more class periods, by course level. Boxplot panels: Pre-Algebra and Arithmetic
Topics; Algebra Topics. Each panel compares below algebra I classes with algebra I
classes.]

In contrast to the algebra I classes, the pre-algebra classes in our sample show
somewhat greater variation in topic coverage, but even at this level, most of the core
topics are being covered in the typical class (80 percent in the median class). The
proportion of algebra I topics that are included in these classes varies considerably, with
those in the top quartile getting some exposure to half of the algebra topics and those in
the bottom quartile to about a quarter of the algebra content. However, Figure 3.2
illustrates why asking about whether or not a topic was taught without asking about the
amount of time spent on it can result in a misleading picture of OTL. As the boxplot
showing the coverage of algebra I topics in pre-algebra classes indicates, the median class
covers only 12 percent of the algebra topics for six or more periods. Just as the algebra
teachers in our sample spent little time teaching lower-level content, the pre-algebra
teachers only briefly introduced their students to algebra topics.
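
The difference between any coverage and indepth coverage can be made concrete with a small calculation. The sketch below is illustrative only: the topic list is the algebra I column of Table 3.1, but the function name, the dictionary layout, and the class data are hypothetical and are not drawn from the study's files.

```python
# Algebra I core topics as listed in Table 3.1.
ALGEBRA_I_TOPICS = [
    "Polynomials",
    "Linear equations",
    "Slope",
    "Writing equations for lines",
    "Inequalities",
    "Coordinate geometry",
    "Distance, rate, time problems",
    "Quadratic equations",
]


def coverage_percent(periods_by_topic, core_topics, min_periods=1):
    """Percent of core topics taught for at least min_periods class periods.

    periods_by_topic maps a topic name to the number of periods a teacher
    reports spending on it as new content; topics not listed count as zero.
    """
    if not core_topics:
        raise ValueError("core_topics must not be empty")
    covered = sum(
        1 for topic in core_topics
        if periods_by_topic.get(topic, 0) >= min_periods
    )
    return 100.0 * covered / len(core_topics)


# Hypothetical pre-algebra class that touches four algebra I topics,
# but spends six or more periods on only one of them.
reported = {
    "Linear equations": 8,
    "Slope": 2,
    "Inequalities": 3,
    "Polynomials": 1,
}
any_coverage = coverage_percent(reported, ALGEBRA_I_TOPICS)
indepth_coverage = coverage_percent(reported, ALGEBRA_I_TOPICS, min_periods=6)
```

In this invented class, half of the algebra I topics count as taught, yet only one in eight receives six or more periods, which is the kind of gap the two figures are meant to expose.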

Figures 3.3 and 3.4 present the same kind of information for algebra II and math
analysis classes. What is striking about these classes is that in each case there is little
variation in coverage of the core course content. In the typical algebra II class, about 80
percent of the algebra II topics are covered and about the same proportion of math
analysis topics are covered in the math analysis courses. Similarly, in both courses, most
of the core topics are covered for six or more periods. Still, there is considerable overlap
in topic coverage between the two courses, with the median algebra II class covering 40
percent of the math analysis topics (20 percent for six or more periods), and the median
math analysis class covering 71 percent of the algebra II topics (35 percent for six or
more periods). 

This presentation of topic coverage expands on past uses of course content data to 
illustrate how knowing the amount of class time spent on a set of topics provides a more 
accurate measure of students' opportunity to learn. With this additional information, we 
found that some students are receiving indepth instruction on core topics, while others 
are only briefly introduced to them. Comparing topic coverage across course levels also 
allows us to estimate the distribution of course-level content that students are receiving,
as compared with their exposure to topics that are either above or below the level of the
course. Using indicator data on topic coverage in these ways can provide a quite
thorough depiction of instructional content. However, the quality of that picture depends
on the accuracy of the survey data from which it is drawn.

[Figure 3.3 - Proportion of algebra II and math analysis topics taught as new content, by
course level. Boxplot panels: Algebra II Topics; Math Analysis Topics. Each panel
compares algebra II classes with math analysis classes. Vertical axis: 0 to 100.]

[Figure 3.4 - Proportion of algebra II and math analysis topics taught as new content for
six or more class periods, by course level. Boxplot panels: Algebra II Topics; Math
Analysis Topics. Each panel compares algebra II classes with math analysis classes.
Vertical axis: 0 to 100.]

CONSISTENCY BETWEEN THE SURVEYS AND THE COURSE TEXTBOOKS 

For those chapters of their textbooks that teachers indicated they had covered or
planned to cover before the end of the year, coders counted the number of lessons
devoted to each of the topics listed on the post-data collection survey.† These counts
were then converted into the six response categories in question 11 of the survey,‡
using the algorithm outlined in the footnote to Table 3.2.
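
The conversion described in the footnote to Table 3.2 (lessons on a topic, divided by total lessons covered, multiplied by 140 periods, then placed in one of six response categories) can be sketched as follows. This is a hedged illustration rather than the study's actual code: the function and constant names are invented, and the rule for rounding a fractional number of periods before assigning a category is an assumption, since the report does not state one.

```python
PERIODS_PER_YEAR = 140  # approximation given in the footnote to Table 3.2

# Upper bound (in periods) of each closed survey response category, in order:
# 0, 1-2, 3-5, 6-10, and 11-20 periods. Anything above 20 is the sixth category.
CATEGORY_UPPER_BOUNDS = [0, 2, 5, 10, 20]
CATEGORY_LABELS = ["0", "1-2", "3-5", "6-10", "11-20", ">20"]


def estimated_periods(topic_lessons, total_lessons):
    """Scale a textbook lesson count to an estimated number of class periods."""
    if total_lessons <= 0:
        raise ValueError("total_lessons must be positive")
    return topic_lessons / total_lessons * PERIODS_PER_YEAR


def response_category(periods):
    """Return the index (0-5) of the survey response category for periods.

    Assumption: fractional periods are rounded half up to a whole number
    before the category is chosen.
    """
    whole_periods = int(periods + 0.5)
    for index, upper in enumerate(CATEGORY_UPPER_BOUNDS):
        if whole_periods <= upper:
            return index
    return len(CATEGORY_UPPER_BOUNDS)  # open-ended ">20" category


# Hypothetical textbook: 6 of the 94 lessons a teacher covered address a topic.
example = response_category(estimated_periods(6, 94))
```

Here six lessons out of 94 scale to roughly nine periods, which lands in the "6-10" category. A different rounding rule would move cases that sit near a category boundary.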

Across all topics, the average rate of direct agreement between the surveys and
textbooks is 42 percent and within one survey response category, 72 percent. The
average level of agreement suggests that although survey data may not provide a very
precise picture of the time spent on different topics, such information is reasonably
accurate at the level of being able to ascertain that a topic has been taught not at all or
for only a few periods, for a week or two, or for several weeks. Given that most OTL
measures tend to be fairly crude (i.e., typically reporting whether or not a general topic
has been covered with no information about the time spent on it), being able to report
topic coverage at this level of specificity represents a significant improvement.
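
The two agreement measures reported here can be illustrated with a short calculation. The sketch assumes that each teacher's survey response and the corresponding textbook code have already been expressed as category indices from 0 to 5; the function name and the sample data are hypothetical.

```python
def agreement_rates(survey, textbook):
    """Return (percent direct agreement, percent agreement within one category).

    survey and textbook are equal-length sequences of response-category
    indices (0-5), one pair per teacher.
    """
    if len(survey) != len(textbook) or not survey:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(survey, textbook))
    direct = sum(1 for s, t in pairs if s == t)
    within_one = sum(1 for s, t in pairs if abs(s - t) <= 1)
    return 100.0 * direct / len(pairs), 100.0 * within_one / len(pairs)


# Hypothetical category indices for ten teachers on a single topic.
survey_categories = [0, 1, 3, 4, 2, 5, 1, 3, 2, 0]
textbook_categories = [0, 2, 3, 2, 2, 4, 1, 5, 3, 0]
direct_rate, within_one_rate = agreement_rates(
    survey_categories, textbook_categories
)
```

Direct agreement is always a subset of within-one agreement, so the second rate can never be lower than the first.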

But the mean rate of agreement masks significant differences across topics. Table
3.2 lists those topics (15 of the 40 included on the two forms of the survey) for which the
rates of agreement were below the mean. Table 3.3 lists the eight topics for which the
rates of agreement were the highest. Five of these topics§ are covered primarily in
upper-level courses, so only a minority of the teachers in the sample reported spending
any time on them. In addition, because the scope and sequence of upper-level courses
tends to be more precisely defined (e.g., because of the requirements of Advanced
Placement tests and the narrower focus of the topics covered), teachers may be able to
estimate more precisely the amount of time spent on a topic.

†The instructions to the coders defined a textbook lesson as a subsection of a chapter that typically consists
of a two to three page "spread," with a page or two of explanation followed by a set of exercises. Coders were
instructed to include all substantive sub-sections and to omit enrichment, computer, review and chapter test
sections.

‡The six response categories were 0 periods, 1-2 periods, 3-5 periods, 6-10 periods, 11-20 periods, and >20
periods.

§The five topics covered in upper-level courses are: calculus, measures of dispersion, integration, discrete
math, and vectors.

Table 3.2

Consistency Between Topic Coverage As Reported on Surveys and As
Coded from Textbooks*

Topics Where Direct Agreement Is <42% and Agreement Within One Survey
Response Category Is <72%

                               % Direct     % Within       Possible Explanation
Topic                          Agreement    One Category   For Inconsistency

Linear Equations               [illegible]  42.4           Low inter-coder reliability (K=.259)
Conversion among Fractions,    20.6         70.6           Use vs. specific focus of teaching
  Decimals, Percents
Conic Sections                 24           60
Polynomials                    21.7         55             Low inter-coder reliability (K=.359)
Graphing                       25           66.7           Use vs. specific focus of teaching
Inequalities                   27.3         69.7           Low inter-coder reliability (K=.324)
Tables & Charts                25           66.7           Use vs. specific focus of teaching
Proportional Reasoning         2[??]        65.5           Lack of understanding of common
                                                             meaning among respondents;
                                                             low inter-coder reliability (K=.067)
Patterns & Functions           22[?]        64.6           Lack of understanding or common
                                                             meaning among respondents
Ratio, Proportion, Percents    25.7         54.3           Low inter-coder reliability (K=[?]89)
Sequences                      29.6         59.3
Slope                          26.7         68.3           Low inter-coder reliability (K=.324)
Math modeling                  28.3         [illegible]    Lack of understanding or common
                                                             meaning among respondents
Estimation                     33.3         70.6           Use vs. specific focus of teaching
Matrices                       42.3         69.2

* Textbooks were coded for the number of lessons in which a topic was covered. Because textbooks divide
material differently and include varying numbers of lessons, the number of textbook lessons covered by
teachers ranged from 34 to 181, with a mean of 93.6 and a standard deviation of 29.4. In order to
standardize across texts and to make valid comparisons with teachers' reports about the number of periods
spent on a topic, the number of lessons on a given topic was divided by the total number of lessons and
multiplied by 140, which is an approximation of the total number of periods of instruction in a given
academic year. The resulting number was then converted into one of the six response options for reporting
[remainder illegible].

Table 3.3

Consistency Between Topic Coverage As Reported on Surveys and As
Coded from Textbooks*

Topics Where Direct Agreement Is >50% and Agreement
Within One Survey Response Category Is >80%

                               % Direct     % Within
Topic                          Agreement    One Category

Calculus                       85.2         88.9
Measures of Dispersion         80[?]        92.3
Integration                    80[?]        96.2
Discrete Math                  76[?]        92.3
Growth & Decay                 66.7         86.7
Vectors                        63           88[?]
Probability                    59           80.3
Statistics                     50[?]        82

* The process by which the textbook data were recoded to be comparable with the survey
data on topic coverage is described in the footnote to Table 3.2.

We identified six possible reasons for the low rates of agreement. The first was
the algorithm we used in converting the continuous data on topic coverage, as reflected
in the textbooks, into the same categories that teachers used on the surveys. We knew
that textbook lessons as a unit of analysis may not be exactly comparable to periods of
instruction, and that the number of days available for instruction varies by school, and
even for classes within the same school (e.g., depending on when assemblies, standardized
testing and the like are scheduled). Consequently, we tried standardizing the two
measures of topic coverage in a variety of different ways (e.g., by raising or lowering the
approximate number of periods of instruction over the year, by examining only textbook
lessons and topics reported on the survey as having already been covered), but none of
these transformations produced significantly different rates of agreement. 

Second, although the double-coding of textbooks indicated that coder error had
been kept to within an acceptable level,† some topics were more prone to coder
disagreement than others. Six of the 15 topics in Table 3.2 had rates of interrater
agreement that are only slight or fair (i.e., a kappa statistic of <.40),‡ and these were
the lowest among the 40 topics coded. Although low inter-coder reliability was only
found to be a problem for a few topics, it does suggest that coders may need more
training and ongoing monitoring than we provided.

†[Beginning illegible] on average, two coders agreed on the exact number of lessons devoted to a topic
[illegible] percent of the time, were within one lesson 75 percent of the time, and within two lessons, 85 percent of the
time.

‡A kappa statistic is a measure of interrater agreement when there are two unique raters and two or more
ratings. It is scaled to be 0 when the amount of agreement is what would be expected to [illegible]
substantial as .61-.80, and almost perfect as .81-1.00. All but nine of the topics coded [illegible]
moderate levels of agreement or higher, with 19 of the topics at the substantial [illegible].
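
For readers unfamiliar with the statistic, a kappa value for two coders can be computed as in the sketch below. It assumes the unweighted form (Cohen's kappa); the report does not say whether a weighted variant was used, and the function name and coder data are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters coding the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement),
    where chance agreement comes from each rater's marginal distribution.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical lesson counts assigned to one topic by two coders
# across ten textbooks.
coder_1 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 0]
coder_2 = [0, 0, 1, 2, 2, 2, 0, 1, 1, 0]
kappa = cohen_kappa(coder_1, coder_2)
```

The two invented coders agree on 8 of 10 textbooks, and chance agreement is 0.34, so kappa is about .70, which would fall in the range the footnote labels substantial.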




A third explanation emerged from our follow-up interviews with a sub-sample of 
the teachers. We returned to talk with them because we noted internal inconsistencies 
in their surveys (most notably between their reports about instructional goals and 
activities) that suggested there was a lack of common understanding for some terms
associated with the mathematics reform movement. Consequently, we asked teachers
how they defined the topics we found to be the most problematic. Three appeared to
present special problems for respondents, and one of these--proportional reasoning--was
also the one with the lowest level of interrater agreement. Teachers told us that the
term proportional reasoning was vague; some reported that they had never seen the two
words combined. For math modeling, several teachers participating in the group
interviews volunteered that they had no idea what the term meant, even though it
appears in key mathematics reform documents (National Research Council, 1989;
California Department of Education, 1992). For another topic, patterns and functions,
interviewees argued that the two concepts should be separated because they are not
parallel or necessarily linked concepts. In this instance, the newer reform literature
(California Department of Education, 1992) seems to agree with our teacher
respondents, and argues against NCTM's (1989) joining the two concepts. The reason is
that patterns play a broadly applicable role in many or perhaps all strands of
mathematics, while functions comprise one specific way of generalizing an observed
pattern. Although the problem of teachers either not understanding the meaning of a
term or interpreting it differently across respondents is considerably greater for
instructional activities and goals than for topics, these examples do suggest that some
survey data cannot be validly interpreted during a time in which language and
accompanying practice are in transition. While only a few topics may fall in this
category, they are the ones of potentially greatest interest for charting trends in 
curricular reform. 

Fourth, for four of the topics with low rates of agreement, we found that teachers 
reported spending greater amounts of time on teaching them than the coders estimated. 
These four topics--conversion among fractions, decimals, and percents; graphing; tables and
charts; and estimation--have in common that they are tools or building blocks that
students can draw upon in working problems on other substantive topics. For example,
some geometry textbooks have students record their measurements of geometric figures
in a table format. Although tables and charts is not the specific focus of teaching during
such exercises, it is being used by students. We believe that teachers' overestimation of
their coverage of these topics stems from their not making a distinction between having
students use the concept while working on other topics and having the topic as the
primary focus of a substantive lesson. This particular problem can be addressed by
including clearer instructions in survey prompts.†

In designing this study, we assumed that there are two other factors which affect
the validity of topic coverage data obtained from teacher surveys. One is the high level
of generality that characterizes the topics included on most national surveys. Because we
assumed that specific topics will yield more valid data, as well as a more detailed picture
of students' opportunity-to-learn, we included a greater number of specific topics on our
post-data collection survey. Several of those topics were elaborations of the NELS
topics. As Table 3.4 indicates, more specific topics had higher rates of agreement
between the surveys and textbooks than did the general NELS topics. We recognize that
there are trade-offs associated with including longer topic lists, particularly on surveys
that must serve multiple purposes in addition to gathering data on curriculum. But our
research suggests that, despite the potentially greater teacher burden, national surveys
will need to be more comprehensive if they are to provide valid data on topic coverage.

†As indicated in Table 3.2, we found that these four reasons helped explain most of the lowest rates of
agreement between the textbooks and the surveys. However, for three topics--conic sections, sequences, and
matrices--none of these reasons apply. The rate of interrater agreement was at the moderate level or above for
these topics; teacher respondents agreed on their meaning; and they did not systematically overestimate their
coverage of these topics as they did for those that might be classified as tools.

A final factor likely to affect the reliability and validity of survey data on topic
coverage is the time frame over which teachers are asked both to recall what they have
already taught and to estimate what they will teach over the rest of the year. We
assumed that such data are likely to be considerably less precise than if teachers are
asked to report on topic coverage concurrently with their actual teaching. We were able
to test that assumption by comparing teachers' reports of topic coverage on their daily 
logs with the assignments they made over the same five weeks. Eleven topics were 
reported as being covered over this period by at least a third of the teachers. Across 
those topics, the rate of direct agreement was 58 percent and 83 percent within one 
category, using the same response categories as the survey. To ensure that this higher 
rate of agreement was not an artifact of converting the continuous data from the logs
and assignments into the survey categories, we also calculated the rate of agreement
between the exact number of periods teachers reported covering a topic and the number
of times coders identified it as being included on the assignments. We found that the
average rate of agreement within one class period or assignment was 40 percent and 59
percent within two. The extent of improvement in the quality of data collected
simultaneously with the teaching of a topic is illustrated by the fact that eight of the 11
topics are also ones that had the lowest rates of agreement between the survey and the
text. Using the log and assignment data multiplied the rate of agreement by between
one-and-a-half and two-and-a-half times for these topics.

Clearly, asking teachers to report on their topic coverage concurrently with their
teaching of the content is not a feasible strategy for routine data collection in national
surveys. The gains in improved reliability would be offset by an incomplete picture of
the content presented to students throughout the year. In addition, having a large 
number of teachers report on a daily basis for even several weeks might increase the 
costs of national surveys and would most likely result in a lower response rate overall. 
CONCLUSIONS 

Our analysis suggests that teacher survey data can provide a reasonably accurate
picture of topic coverage. If the standard is knowing whether or not a topic has been taught and,
if it has been taught, whether it has been covered over several periods, for a week or
two, or for several weeks, then teacher self-reports are reliable. However, our data
provide a strong rationale for including more specific curricular topics on surveys. Not
only do they provide a more detailed and comprehensive picture of students'
opportunity-to-learn, but teachers' reports on these topics are more reliable than their
reports about general topics which encompass multiple sub-topics and make precise time
estimates difficult. We recognize the trade-offs in requesting more detailed information 
about topic coverage, but we would argue that the topics currently included on national 
surveys such as NELS provide data that are too general to be useful, particularly in 
measuring OTL. Consequently, the gains in both reliability and validity may more than
offset the additional burden.

In addition to the need for more detailed, enhanced topic lists on national 
surveys, our artifact analysis suggests that validation studies are necessary to pinpoint the 
sources of measurement problems. One area that will continue to be problematic is the 
lack of common agreement on the meaning of key terms associated with the mathematics 
reform movement. Such terms need to be included on national surveys to chart trends in 
topic coverage, but without accompanying validation studies, the data are likely to be 
misinterpreted. Consequently, the use of indepth interviews and focus groups to 
supplement artifact analyses will help in identifying the different understandings that 
teachers hold of concepts central to expected changes in mathematics teaching. But even 
independent of the current flux in curricular practices, validation studies are necessary. 
By collecting detailed data from multiple sources over shorter periods of time (e.g.,
through daily logs and assignments), such studies can provide a benchmark against which 
to judge the reliability of routine survey data that require teachers to recall and estimate 
topic coverage over longer periods of time. 






Chapter 4 
INSTRUCTIONAL STRATEGY

Instructional strategy is a multi-faceted dimension of curriculum that is 
considerably more difficult to measure than instructional content. Part of the reason lies
in its scope. Embodied in this concept are all the various approaches used in the
teaching and learning process. It includes what teachers do (e.g., lecture, lead
discussions, work with small groups) and what students do (e.g., work individually or in
groups, work on projects, use manipulatives). But it also includes the type of work that
students are assigned--the focus and format of that work, how it is evaluated, and the
level of understanding expected of students. 

Instructional strategy is also difficult to measure because surveys typically cannot
capture the subtle differences in how teachers define and use different techniques. For
example, one teacher might lecture directly from the textbook, and do most of the
talking in the class. Another might draw on material from sources other than the text
and engage students in lively give-and-take exchanges. Without a detailed set of survey
probes, however, both teachers are likely to report that they spend most of the period
lecturing. Even a detailed survey would likely fall short in representing these two
classrooms because it could not adequately measure the nature of the interaction
between students and the teacher.† Yet someone observing those classrooms would
identify two very different kinds of instruction. 

Despite these major limitations, however, there is still much that survey data can 
tell us about instructional strategy. Such data can describe the major dimensions of 
classroom processes and how they vary across course-levels and types of schools. 
National survey data, collected on a periodic basis, can document trends in teachers' use
of generic instructional strategies. Such information is important in determining whether 
or not teaching is changing in ways consistent with the expectations of curriculum 
reformers and their policymaker allies. 



†This shortcoming of survey instrumentation also applies to artifact data. In fact, [remainder illegible]

PORTRAYING INSTRUCTION FROM SURVEYS: ILLUSTRATIVE EXAMPLES 

The clearest picture of instruction that emerges from our survey data is teachers' 
reliance on a few strategies that they use frequently. A large proportion of teachers 
reported engaging in traditional activities such as lecturing (87 percent) and correcting or 
reviewing homework (86 percent) on a daily basis, while the majority reported engaging 
in activities consistent with the mathematics reform movement on only an infrequent 
basis or not at all--e.g., 65 percent of teachers reported having student-led discussions
once or twice a semester or not at all; 61 percent rarely or never discussed career 
opportunities in mathematics, and slightly fewer than half of our respondents (49 
percent) have their students work in small groups at least once or twice a week. 

Calculator use in these classrooms is very high, with 74 percent of the teachers 
reporting that students use them almost daily. Usage is just the opposite for computers,
however. Students use computers on a daily basis in less than 2 percent of the courses,
and in over half of the classes (52 percent), computers are never used to solve exercises
or problems.† Most teachers reported that the majority of class time is spent in direct
instruction, with student discipline and administrative tasks such as taking attendance 
consuming less than 10 percent of their in-class time. 

The tendency in national indicator reports such as those produced by NAEP (e.g.,
Mullis, et al., 1991; 1994) has been to focus on single questionnaire items, examining
each teaching strategy separately rather than seeking to understand how teachers link
discrete strategies to create instructional repertoires. Given that teachers rarely use just
one strategy and typically rely on several even in the same lesson, reporting on an item- 
by-item basis fails to produce a coherent picture of instruction. Consequently, we probed 
our survey data to see if we could identify different instructional repertoires in which 
teachers combine a number of separate strategies. Our first approach consisted of 
grouping instructional techniques according to the strategies advocated in reform 



The low incidence of computer use reported by the mathematics teachers in our sample is similar to the
level of reported use found by Weiss (1994) in her national survey of science and mathematics teachers. Fifty-six
percent of the mathematics classes in her sample never use computers. The level of teacher lecture,
textbook usage, and small group work in our sample is also similar to the patterns documented by Weiss.




documents such as the NCTM Professional Standards (1991) and the California 
mathematics framework (1992). We also created a list of techniques which seemed to 
represent the more traditional teaching repertoire to which the reform documents were 
reacting. Table 4.1 shows the two groups of strategies. To test the consistency of these 
groupings, we scaled them, and found that they cohered reasonably well with an alpha of
.72 for the reform scale and .62 for the traditional scale. The parts of the scale that 
were intended to operate in a negative direction (e.g., not lecturing as part of the reform 
repertoire) did in fact reverse as expected. 
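
The scale check described above can be sketched as follows. This is only an illustration, assuming that the alpha reported is Cronbach's alpha and that survey frequency responses are coded 1 (never) through 5 (almost every day). The function name, the coding, and the toy data are hypothetical and are not taken from the study. Items intended to operate in a negative direction (e.g., lecturing on the reform scale) are reverse-scored before the statistic is computed.

```python
from statistics import pvariance


def cronbach_alpha(responses, reversed_items=(), low=1, high=5):
    """Cronbach's alpha for a list of respondents' item codes.

    responses: list of equal-length lists, one per teacher, with each item
    coded low..high (a hypothetical coding, not the study's).
    reversed_items: column indexes that should be reverse-scored.
    """
    scored = [
        [(low + high - v) if j in reversed_items else v for j, v in enumerate(row)]
        for row in responses
    ]
    k = len(scored[0])
    # Variance of each item across teachers, and of each teacher's total score
    item_variances = [pvariance([row[j] for row in scored]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in scored])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data (invented): three teachers, two reform items plus lecture (reversed)
toy = [[1, 2, 5], [3, 3, 3], [5, 4, 1]]
alpha = cronbach_alpha(toy, reversed_items={2})
```

The same function could be applied separately to the reform and traditional item groupings; a higher value indicates that the items in a grouping move together across teachers.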

We also conducted a factor analysis as another way of identifying instructional 
strategies that occur together. Three factors emerged that seem to have substantive 
meaning. The first, shown in Table 4.2, is dominated by discussion strategies with a
strong emphasis on the role of students in class discussions. The second has a clear
demonstration component. Student participation strategies are part of this repertoire,
but it is much more teacher-directed than the first one. The third factor, with only two 
components, is the closest to what would be considered traditional teaching with the 
teacher lecturing and students responding to the teacher's questions. 

The lack of variation in classroom practice across the teachers in our sample is 
the primary reason why the results of the factor analysis indicated that most of the 
instructional strategies included on our survey do not fit into a common factor space. 
Nevertheless, the coherence of the reform and the traditional practice scales and the 
high face validity of the factors that did emerge suggest that future efforts to link 
instructional strategies and student outcomes should move away from separate analyses 
of single questionnaire items and focus greater attention on identifying and 
understanding different instructional repertoires. 

The picture of instruction that emerges from our survey data is quite consistent 
across course levels. As might be expected, those teaching algebra I and courses below 
that level had students practice or drill on computational methods more frequently than
teachers in higher level courses. In addition, the number of minutes per day of
homework that teachers assigned was significantly greater for higher level courses, with
the mean ranging from only 19 minutes per day in courses below algebra I, to about a






Table 4.1
Instructional Repertoire Scales

                                                              Reform   Traditional
Lecture                                                         -          +
Students respond orally to questions                                       +
Student-led discussions                                         +
Teacher-led discussions                                                    +
Review homework                                                            +
Students work individually
Students give oral reports                                      +
Administer a test                                               -
Administer a quiz
Discuss career opportunities
Small groups work on problems                                   +
Whole class discusses small groups' solutions                   +
Students read textbooks                                                    +
Teacher summarizes lesson's main points                                    +
Students work on next day's homework                                       +
Students work on projects in class                              +
Teacher demonstrates an exercise at board                       -          +
Students work exercises at board
Teacher uses manipulatives to demonstrate a concept
Students work with manipulatives                                +
Students practice computational skills                                     +
Students work on problems with no obvious method of solution    +
Students use tables and graphs                                  +
Students use calculators
Students use computer                                           +
Students respond to questions that require writing at
  least a paragraph                                             +








Table 4.2
Instructional Repertoire Factor Matrix*

                                                        Factor 1     Factor 2     Factor 3
Student-led discussions                                 [illegible]  [illegible]  [illegible]
Teacher-led discussions                                 [illegible]  -.02         .34
Small groups work on problems                           [illegible]  -.02         -.18
Whole class discusses small groups' solutions           .74          [illegible]  .14
Students use calculators                                .54          .19          -.18
Students respond orally to questions                    [illegible]  .78          .89
Administer a test                                       [illegible]  [illegible]  [illegible]
Teacher demonstrates an exercise at board               [illegible]  .54          [illegible]
Teacher uses manipulatives to demonstrate a concept     [illegible]  .31          -.22
Students work with manipulatives                        [illegible]  [illegible]  [illegible]
Lecture                                                 [illegible]  [illegible]  .80
Review homework                                         [illegible]  -.15         .20
Students work individually                              -.16         [illegible]  [illegible]
Students give oral reports                              .21          .39          [illegible]
Administer a quiz                                       .14          [illegible]  .30
Discuss career opportunities                            .23          [illegible]  .39
Students read textbooks                                 [illegible]  .10          .24
Teacher summarizes lesson's main points                 .02          .46          .40
Students work on next day's homework                    -.16         -.11         .24
Students work on projects in class                      .12          .43          -.16
Students work on exercises at board                     .20          .25          [illegible]
Students practice computation skills                    [illegible]  -.22         -.08
Students work on problems with no obvious method of
  solution                                              .44          .36          .11
Students use tables and graphs                          .43          .19          -.18
Students use computers                                  .06          [illegible]  -.22
Students respond to questions that require writing at
  least a paragraph                                     .38          .31          -.25

Eigenvalue                                              4.62         2.31         2.14

* Although six factors were extracted, the three primary factors are presented here
because the scree diagram indicated that the remaining factors did not explain a
substantial, additional proportion of the variance, and they were not readily interpretable
on substantive grounds.




half hour a day for algebra I (32 minutes), to almost an hour a day for calculus (55
minutes). Beyond these differences, however, there were few other significant 
differences across course levels. Teachers in higher level courses were just as likely as 
those in lower level courses to lecture frequently, have students work on their next day's 
homework in class, and then correct or review that homework. Similarly, the infrequency 
of strategies such as student-led discussions and small group work was quite similar 
across course levels. 

One course, however, does seem to rely on different teaching strategies. Although 
the number of calculus classes in our sample was too small to make any generalizations, 
those classes did differ from the other courses in the sample in several major ways. 
Calculus teachers reported lecturing less frequently and relying more on small group 
work by students. But with this exception, the similarities across courses in our sample 
are far more striking than the differences. 

CONSISTENCY BETWEEN THE SURVEY AND THE ARTIFACTS 

Since instructional strategy is the dimension of curriculum least amenable to 
validation through written artifacts, we were limited in our ability to measure the 
consistency of survey responses with other data. However, for 14 of the 26 instructional 
practice items listed on the survey, we could compare teachers' survey responses at the 
end of the semester with their daily log entries during the five weeks of artifact data 
collection. We were also able to compare teachers' survey responses about the format 
and other characteristics of their exams and quizzes with their artifacts. Finally, we were 
able to compare teachers' actual homework assignments with their responses to a survey 
question about the characteristics of those assignments. 

Logs and Surveys

Table 4.3 shows the rate of exact agreement between the surveys and the logs on
the frequency with which teachers reported engaging in a variety of instructional
activities. The level of agreement within one survey response category is also






Table 4.3

Consistency Between the Reported Frequency of
Instructional Activities on the Logs and the Surveys

                                                        Direct        % Within One
                                                        Agreement     Survey Category
Lecture                                                 57.4          96.7
Have students respond orally to questions               46.7          90
Correct or review homework in class assignments         [illegible]   [illegible]
Administer a test                                       60            83.3
Have students work with other students in
  small groups                                          50            100
Have students work on next day's homework
  in class                                              34.6          61.5
Demonstrate exercise at the board                       50            [illegible]
Have students work exercises at the board               59            91.8
Use manipulatives to demonstrate a concept              52.5          83.6
Have students work with manipulatives                   48.3          86.7
Have students use a calculator                          47.2          90.6
Have students work on a computer                        49.1          86.8

Mean rate of agreement                                  48.0          86.5



reported. Given that the data in this table compare information that teachers
provided at three times over the course of the semester (the last time within one week of
completing the survey) with their responses to the survey completed at the end of the
semester, the rate of direct agreement is quite low.

The relatively high rate of agreement within one survey response category does 
suggest, however, that the problem may lie in how the survey response categories were
constructed. The distinctions among them may not have been sufficiently discrete or
meaningful to respondents. To check this possible explanation, we compared the mean
frequency of instructional activities, as reported on the logs, with the category teachers
used in responding to the survey. We found that the response options teachers used on
the surveys did not always reflect actual differences in the frequencies of instructional 
activities as they had reported them on the logs. The most common problem was the 
lack of significant differences in the frequency of activities as reported on the logs for the 
survey response categories, almost every day and once or twice a week, and for once or 
twice a week and once or twice a month. For eight of the 14 activities for which a 
comparison was made, the mean for two of the survey categories was virtually the 



In order to compare the survey responses with the log entries, we had to convert the continuous data from
the logs (i.e., how many times over the 25 days of data collection teachers reported engaging in an activity) into
the categorical response options used in question #13 on the survey. Anything that was done 60 percent or more
of the time (15 or more days) was recoded as almost every day, any activity that occurred 25-59 percent of the time (5-14
days) was recoded as once or twice a week, and 24 percent or less (< 5 times) was recoded as once or twice a
month. We included never as a comparison category, but did not have a comparison category for once or twice
a semester for the log data.

In comparing the survey and log data, we chose to use the rate of direct agreement and the rate of
agreement within one survey response category as our measure of consistency. Another recent study (Porter et
al., 1993) that made similar comparisons between log and survey data used correlation coefficients as the
measure of consistency. That study found [remainder illegible] (2-31) and concluded that "the validation
results were very encouraging" (A-5).

Although the correlations for seven of the 14 instructional practice items on our survey and log were
greater than .30 (and significant at the .01 level) and only two items had correlations below [illegible] (with one
significant at the .05 level and the other nonsignificant), we did not find this analysis to be very informative.
Correlation coefficients conflate matches and mismatches across categories in a way that makes it difficult to
retrieve information about specific patterns of responses to the two types of data collection instruments. A
clearer and more intuitively attractive way to compare the two is to examine how close the agreement is between
the two indicators. The percentage agreement statistic measures that level of consistency directly.




same. Although respondents reported no problems in using these response categories,
the log data suggest that teachers who had engaged in an activity with the same
frequency used different categories in reporting on it on their surveys. The response
categories that are the most problematic are almost every day and once or twice a week,
with the log data suggesting that reliable distinctions cannot be made between these two
categories, based on survey data.
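
The log-to-survey comparison can be illustrated with a short sketch. It applies the day-count cutoffs given in the footnote on recoding the log data (15 or more of 25 days, 5-14 days, fewer than 5 days) and then computes the two consistency measures used in Table 4.3. The function names, the treatment of zero days as never, the restriction to the four comparable categories, and the sample pairs are assumptions made for illustration only.

```python
# Ordered from least to most frequent; "once or twice a semester" is omitted
# because the log data had no comparison category for it.
CATEGORIES = [
    "never",
    "once or twice a month",
    "once or twice a week",
    "almost every day",
]


def recode_log_days(days):
    """Map a count of log days (out of 25) to a survey response category."""
    if days == 0:  # assumption: zero logged days corresponds to "never"
        return "never"
    if days >= 15:
        return "almost every day"
    if days >= 5:
        return "once or twice a week"
    return "once or twice a month"


def agreement_rates(pairs):
    """Return (% direct agreement, % agreement within one category).

    pairs: list of (survey_category, log_days) tuples, one per teacher.
    """
    direct = within_one = 0
    for survey_category, days in pairs:
        gap = abs(
            CATEGORIES.index(survey_category)
            - CATEGORIES.index(recode_log_days(days))
        )
        direct += (gap == 0)
        within_one += (gap <= 1)  # includes exact matches
    n = len(pairs)
    return 100 * direct / n, 100 * within_one / n


# Invented responses for one activity from four hypothetical teachers
sample = [
    ("almost every day", 20),
    ("once or twice a week", 16),
    ("never", 6),
    ("once or twice a month", 3),
]
direct_pct, within_one_pct = agreement_rates(sample)
```

Unlike a correlation coefficient, the two percentages show directly how often a teacher's survey category matches, or sits next to, the category implied by the logs.
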
Exams and Surveys 

Another aspect of instructional strategy that we were able to validate from the 
artifacts was the format and characteristics of teachers' exams. Teachers were asked to
indicate the proportion of their tests and quizzes that were multiple choice, short-answer,
essay, and open-ended problems. In addition, they were asked to indicate the proportion
of their exams that included items with certain characteristics such as requiring students
to describe how to solve problems or problems with more than one possible answer or
one possible approach. Figures 4.1, 4.2, and 4.3 compare the means for the survey
responses and the artifact coding. 

On six of the 15 questions that teachers were asked about their exams, the level 
of agreement between teachers and coders was 90 percent or more. There was high 
agreement between teachers and coders in the proportion of exam items that were 
multiple choice (with the difference in means between the two sources less than 3 
percent) and those that were essay (the difference in means was 3 percent). Similarly, 
there was high agreement on the proportion of test items requiring the use of tabular or 
graphical data, the proportion requiring students to describe how to solve problems, and 
the proportion of problems with more than one answer. However, there were major 
disagreements between survey respondents and coders about the proportion of exam 
items that were short-answer (a difference between the two sources of about 50 percent) 
and that were open-ended problems (a 51 percent difference). 



For example, for lecture, the mean over the twenty-five days of data collection for teachers indicating almost
every day was [illegible], and for those reporting once or twice a week, 11.6. For teacher-led discussion,
the means were 12.8 and 12.1 respectively; demonstrate an exercise at the board, 15.4 and 14. Similarly for
administer a test, the mean for once or twice a week was 4.2 and for once or twice a month, 4.3.



[Figure 4.1 - Comparison of Item Formats on Exams. Categories: Multiple Choice, Short Answer, Essay, Open-Ended.]

[Figure 4.2 - Comparison of Exam Item Characteristics. Series: Survey, Artifacts. Categories: Recall Definitions, Use Algorithms, Describe Solution, Explain Reasoning, Apply to Unfamiliar Situation, Analyze Suggested Solution.]

[Figure 4.3 - Comparison of Exam Problem Types. Categories: Minor Variation of Past Work, Multiple Answers, Multiple Approaches, Multiple Steps, Tables & Graphs.]


Other questions where the level of disagreement between the surveys and the
artifacts was greater than 20 percent were the proportion of exam items: requiring
students to recognize or recall definitions; requiring the use of algorithms to solve
problems; that are minor variations of homework, class exercises, or problems; with more
than one possible approach; and requiring more than one step to reach a solution. As
illustrated in the histograms, the coding of teachers' exams presents a more traditional
picture of their approach to evaluating students than they reported on their surveys.
According to the artifacts, teachers were less likely to include items that required
students to describe how they solved problems, explain their reasoning, or apply
concepts to different or unfamiliar situations than they indicated on their surveys.
Similarly, the exams evidenced a smaller proportion of items with more than one
possible answer, more than one possible approach, or that required more than one step
to reach a solution than respondents estimated on their surveys.

The survey response categories related to exam characteristics that were 
particularly problematic were short-answer, open-ended problems, and the difference 
between the two. Even though the survey instrument defined open-ended problems as 
those where students generate their own solutions, teacher respondents and coders viewed 
the exam formats quite differently, with teachers tending to classify as short-answer those
items that coders categorized as open-ended problems. In classifying exam items, coders
used a narrow definition for short answer, viz., a question requiring students to complete
a sentence or fill in the blanks. However, our follow-up interviews indicated that while
some teachers had interpreted the term more broadly, they also differed in their 
definitions. For example: 

Short-answer is giving students a specific question to answer. On the other
[open-ended problems], students can go in different directions and they are graded on
how in-depth their answer is. (Math B teacher)

A short-answer is when students are following standard procedures. I use
open-ended when I'm introducing new topics and I don't give students a way to do it. I
give them a good background on what they should be finding, but I don't guide
them. An open-ended problem ... is more about what students are expected to do
than the format of the test. Just because a test doesn't have a blank to fill in for
the answer doesn't make it open-ended. (Calculus teacher)




What's the difference among short-answer, essay, and open-ended? They're all 
the same. (Intermediate math teacher) 

The differing interpretations of these seemingly straightforward terms, as 
evidenced in the discrepancies between the survey responses and the exam coding and in 
the follow-up interviews, illustrate the need to define a number of survey items more 
precisely. In the case of questions about exam format, a more precise set of response 
options might be: 

- Multiple-choice 

- Problems where students generate a solution and show their work, but no 
written explanation is required 

- Problems where students generate a solution, show their work, and are also
expected to explain their work in writing

This set eliminates the ambiguity inherent in distinguishing among short-answer, essay,
and open-ended items, while making a clear distinction between multiple choice and
constructed responses and, within the constructed response category, between answers
that require a written explanation in addition to mathematical calculations and those
that do not.

Homework Assignments and Surveys

Teachers' daily homework assignments are a final source of validation about their 
instructional strategies. Teachers were asked on the survey how often they assigned 
certain types of homework, and their actual homework assignments were then coded to 
determine the extent to which they reflected these characteristics. For those assignments 
that were not done entirely during class time, the rate of direct agreement between the
survey responses and the artifact coding was 48 percent and within one response
category, 73 percent. Behind this overall level of agreement is the same pattern that

Of the 1407 individual assignments in our artifact sample, 230 (16 percent) were worked on only during
class, 389 (28 percent) were worked on by students both during and outside class, 223 (16 percent) were done
only outside class, and for the remaining 563 (40 percent), teachers did not designate where the assignments were
done. In comparing artifacts with the responses to question 21, we chose to include all assignments, except those
that were done only during class time, because we assumed most of the undesignated ones were homework
assignments. However, we also checked the rate of agreement between the survey and those assignments that
were done either completely outside class or worked on both during and outside of class. The rate of direct
agreement was 42 percent and within one response category, 69 percent, essentially the same pattern as for the
larger set of assignments.




was evident for other types of instructional strategies. The teachers in our sample rely
on only a few types of assignments, and while they report the predominance of these in
their survey responses, teachers still indicate greater variety in their assignments than
was identified in the artifacts. Ninety-one percent of the assignments in our artifact file
are either exercises or problems from the textbook (72 percent) or exercises or problems
from worksheets (19 percent). Similarly, 83 percent of the teachers report on their
surveys that they assign textbook problems at least once or twice a week, and 72 percent
report assigning worksheet problems with the same frequency. A substantial proportion
of teachers also report that they never give homework assignments that require students
to write definitions of concepts (40 percent), solve problems for which there is no
obvious method of solution (23 percent), or extend results established in class (29
percent). Nevertheless, another sizeable group of teachers reported using these more
innovative homework strategies and going beyond just textbook problems; for example, over half
report assigning homework problems with no obvious method of solution at least once or
twice a month. However, the artifacts present a picture of homework assignments that
are more traditional with considerably less variety in the type of tasks required of
students. The artifacts indicate that the proportion of teachers who never use more
innovative homework strategies, such as assigning problems with no obvious solution
method, exceeds the survey reports by a factor of two to four, depending on the
strategy. As a result, the rate of agreement between the two data sources is lowest for
the more innovative homework strategies.

CONCLUSIONS

To the extent that we were able to validate the survey data collected on teachers' 
instructional strategies, we found that such data present an accurate picture of which 
instructional strategies are used most often by teachers, and they provide some indication 
of how teachers combine strategies during instruction. Although the picture of teaching 



Assignment characteristics for which the direct rate of agreement between the two data sources was 30
percent or lower included: reading the text or supplementary materials (30 percent), applying concepts or principles
to different or unfamiliar situations (13 percent), solving problems for which there is no obvious method of solution
(30 percent), and solving applied problems (28 percent).




that can be drawn from survey data is not a finely-grained one, it is likely to be valid 
because both the survey and the artifact data clearly show that there is little variation in 
teachers' instructional strategies. Basically, the majority of teachers use a few 
instructional strategies and use them often. 

Survey data are, however, limited in the precision with which they can measure 
how frequently teachers use particular strategies. Although teachers may find it easier to 
respond to questions that provide the five response categories typically included on 
national surveys, valid distinctions can probably only be made among activities that are 
done weekly or more often, monthly, once or twice a semester (i.e., infrequently), and 
never. Our analysis suggests that the distinction between daily activities and those done 
once or twice a week is not reliable. 

Our comparison of teachers' reports on their exam formats with an
analysis of their actual exams provides an example of where a rather simple re-wording
of survey response options can produce more meaningful data. But the other
inconsistencies we identified in comparing survey responses with the exams and
assignments are symptomatic of a more serious problem. Teachers see their exams and
assignments as exhibiting greater variety in their underlying instructional strategies than
was evidenced in the artifact coding. Part of the problem might be addressed by
providing more precise definitions of what is meant, for example, by problems with more 
than one possible approach or more than one step to reach a solution. However, 
discrepancies between the two types of data sources suggest more serious problems. 
Teachers see their instruction as more varied and less traditional than is reflected in 
their exams and assignments, and they do not share common meanings for some of the 
terms used by curriculum reformers. 

The implications for the design of more reliable and valid survey instruments are
unclear at this point because so few teachers have adopted the instructional strategies 
advocated by NCTM and similar groups. Consequently, it is difficult even to conduct 
valid pilots of alternative survey question wordings or to test new measures of 
instructional strategies. In the next chapter we attempt to identify more precisely the 
source of problems and where possible solutions might lie for measuring curriculum 




during a time of transition. We do so by probing teachers' reports on their instructional
goals and then comparing those with the goals reflected in their artifacts and with our 
analysis of their instructional strategies. 




Chapter 5
INSTRUCTIONAL GOALS 

A final dimension of curriculum consists of the goals or objectives that teachers 
pursue as they present course content using different instructional strategies. Arguments 
for including measures of teachers' instructional goals as indicators of curriculum rest on 
the assumption that the relative emphasis teachers accord different goals reveals
something about their choices of instructional strategies. Furthermore, some empirical 
evidence suggests that teachers using the same textbook emphasize different aspects of it 
because they value the purposes of instruction differently (for a discussion of why goals 
should be included as curriculum indicators, see Oakes and Carey, 1989).

However, teachers' reports of their course objectives reflect intended behavior
and are less likely to be reliable than reports of actual behavior, such as topic coverage 
and instructional activities. Despite the obvious problems associated with measuring 
instructional goals, some have argued that questions about teachers' goals should be 
included in national surveys because they can function as lead indicators showing the 
direction in which coursework and teaching in a particular subject may be heading. For
example, teachers may report giving some emphasis to goals associated with the 
mathematics reform movement as a precursor to their engaging in activities consistent 
with those goals. While in some instances teachers' goals may signal a future change in 
their behavior, evidence from the implementation of educational innovations suggests 
that it would be inappropriate to make such an inference in reporting national trends. 
As McLaughlin (1990) notes in her overview of findings from implementation research, 
teachers' beliefs may sometimes follow rather than lead their changes in practice,
especially if the changes in practice are mandated. So, for example, teachers may be
required to integrate topics across different subject areas or have students write in
journals, but their belief in the value of those practices may come only after they see that
the changes have positive effects on their students. 

Our research confirms that instructional goals are the most problematic dimension 
of curriculum to measure. The consistency between survey responses and instructional 
artifacts was the lowest among the three dimensions we studied. However, in examining


the reasons for the inconsistency, we did learn something about teachers' perceptions and 
how they integrate new expectations and strategies into their existing approach to 
teaching. Consequently, we first describe how the teachers in our sample viewed their
goals and how the emphasis they reported giving them related to their reported use of 
instructional strategies. We then examine the consistency of teachers' self-reports with 
what coders identified as the goals reflected in teachers' exams and assignments. 

IDENTIFYING TEACHERS' INSTRUCTIONAL GOALS FROM SURVEY DATA:
SOME ILLUSTRATIVE EXAMPLES 

Teachers were asked to rate the emphasis they gave to twenty different 
instructional goals. Figures 5.1 and 5.2 indicate that although teachers' emphasis on
goals that might be considered more traditional was somewhat greater, a majority also
reported giving either a moderate or major emphasis to most of the reform goals
listed.

As we did with instructional activities, we grouped together those instructional 
goals associated with the mathematics reform movement and another set that could be 
considered more traditional. Those scales are displayed in Table 5.1. The reform goals
scaled well, but the traditional goals did not. We can only speculate that the
conventional goals did not scale well because, unlike the reform goals, they are not
based on a coherent theory of instruction. Rather, they represent a set of goals that 
teachers have traditionally pursued, perhaps without regard to the linkages among them. 

We also conducted a factor analysis to determine if there were some subsets of 
goals that were related to each other in a meaningful way. Four factors emerged that 
could be substantively interpreted. The first factor shown in Table 5.2 includes six items
that all deal with students developing critical thinking skills. The second factor includes 
four items that stress having students understand mathematical relationships in different 



A majority of respondents reported giving a moderate or major emphasis to learning to represent problem
structures in multiple ways (79 percent), integrating different branches of mathematics (76 percent), raising questions
and formulating conjectures (77 percent), finding examples and counterexamples (66 percent), judging the validity
of arguments (51 percent), and discovering generalizations (69 percent), in addition to the three reform goals
displayed in Figure 5.1.



[Figure 5.1 - Proportion of Teachers Reporting Major or Moderate Emphasis on Selected Reform Goals. Categories: Analyze Different Approaches, Apply Models to Real World, Use Tables & Graphs, Solve Where No Obvious Solution, Write About Math Ideas.]

[Figure 5.2 - Proportion of Teachers Reporting Major or Moderate Emphasis on Traditional Goals. Categories: Solve Equations, Write Equations, Memorize Facts, Perform Calculations, Understand Proofs.]



Table 5.1 
Instructional Goal Scales 

Goal                                                      Reform   Traditional
Understanding the nature of proof                                       +
Memorizing facts, rules, and steps                                      +
Learning to represent problem structures in multiple ways   +
  (e.g., graphically, algebraically, numerically)
Integrating different branches of mathematics               +
  (e.g., algebra, geometry) into a unified framework
Conceiving and analyzing the effectiveness of different     +
  approaches to problem solving
Performing calculations with speed and accuracy                         +
Showing the importance of math in daily life
Solving equations                                                       +
Raising questions and formulating conjectures               +
Increasing students' interest in math
Integrating math with other subjects                        +
Finding examples and counterexamples                        +
Judging the validity of arguments                           +
Discovering generalizations                                 +
Representing and analyzing relationships using tables,      +
  charts, and graphs
Applying mathematical models to real-world phenomena        +
Writing about mathematical ideas                            +
Designing a study or experiment                             +
Writing equations to represent relationships                            +
Solving problems for which there is no obvious method       +
  of solution

alpha                                                      .86         .37
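
Table 5.1 closes with a reliability coefficient for the goal scales. Assuming that coefficient is Cronbach's alpha (the table does not name the statistic), the following minimal sketch shows how such a value could be computed from item-level ratings. The function name and the sample ratings are hypothetical and are not taken from the study data.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores is a list of items; each item is a list holding one score
    per respondent, in the same respondent order for every item.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("every item needs the same respondents (two or more)")

    # Each respondent's total scale score is the sum across items.
    totals = [sum(item[i] for item in item_scores) for i in range(n)]
    total_variance = pvariance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")

    item_variance_sum = sum(pvariance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical emphasis ratings (0 = none ... 3 = major) for three goals
# and five teachers; illustrative only, not study data.
ratings = [
    [3, 2, 1, 3, 0],
    [3, 1, 1, 2, 0],
    [2, 2, 0, 3, 1],
]
print(round(cronbach_alpha(ratings), 2))
```

A scale on which nearly every teacher gives the same answer has little total-score variance, which tends to push alpha down; that would be consistent with the limited variation on the traditional goal scale discussed later in the chapter.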






Table 5.2 
Instructional Goals Factor Matrix* 

Goal                                                    Factor 1     Factor 2     Factor 3     Factor 4
Raising questions and formulating conjectures           [illegible]  [illegible]  -.14         [illegible]
Judging the validity of arguments                       .73          [illegible]  -.11         .21
Understanding the nature of proof                       .72          [illegible]  [illegible]  [illegible]
Increasing students' interest in math                   .60          [illegible]  -.11         [illegible]
Finding examples and counterexamples                    .59          [illegible]  .22          [illegible]
Discovering generalizations                             .59          .34          [illegible]  .23
Learning to represent problem structures in multiple    [illegible]  .72          .03          [illegible]
  ways (e.g., graphically, algebraically, numerically)
Conceiving and analyzing the effectiveness of different [illegible]  .57          [illegible]  [illegible]
  approaches to problem-solving
Writing equations to represent relationships            [illegible]  [illegible]  .06          .21
Integrating different branches of mathematics           .35          [illegible]  -.28         [illegible]
  (e.g., algebra, geometry) into a unified framework
Solving equations                                       [illegible]  .30          .64          [illegible]
Writing about mathematical ideas                        .11          .23          -.61         .29
Solving problems for which there is no obvious method   [illegible]  .27          [illegible]  [illegible]
  of solution
Applying mathematical models to real-world phenomena    .21          [illegible]  [illegible]  .70
Showing the importance of math in daily life            [illegible]  [illegible]  .17          .52
Integrating math with other subjects                    .35          [illegible]  -.04         .47
Designing a study or an experiment                      .23          [illegible]  -.44         [illegible]
Memorizing facts, rules, and steps                      [illegible]  -.26         .04          [illegible]
Performing calculations with speed and accuracy         .18          .15          .11          [illegible]
Representing and analyzing relationships using tables,  [illegible]  .42          -.30         .31
  charts, and graphs

Eigenvalue                                              [illegible]  1.48         1.43         1.20



* Although five factors were extracted, the four primary factors are presented here because the scree diagram 
indicated that the remaining factor did not explain a substantial, additional proportion of the variance, and it 
was not readily interpretable on substantive grounds. 




ways. Like the first two factors, the fourth is consistent with the goals of the 
mathematics reform movement and deals with the application of mathematics to other 
subjects and to daily life. Only the third factor contains items that might be considered 
more traditional: teachers who emphasize solving equations give little emphasis to 
writing about mathematical ideas and to solving problems for which there is no obvious 
method of solution. 

We expected to see a positive correlation between reform goals and reform 
instructional activities and between traditional goals and traditional modes of teaching. 
The correlation between the reform goal scale and the reform instructional repertoire in 
Table 4.1 is strongly positive (r=.75), while the correlation between reform goals and the 
traditional teaching repertoire is negative (r=-Jl). However, traditional goals and 
instructional activities are not correlated. The correlation between the traditional goal 
scale and the scale of traditional instructional activities is negative (r=-.18) and 
nonsignificant. The major reason is the lack of variation on these two scales: two-thirds 
of the teachers reported a major or moderate emphasis on three or more of the five 
traditional goals and 71 percent reported engaging in six or more of the nine traditional 
instructional strategies at least once or twice a week. 
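
The scale-to-scale comparisons above rest on ordinary correlation coefficients. As a minimal sketch, assuming each teacher has one score per scale (for example, a count of goals emphasized and a count of activities used weekly), a Pearson correlation could be computed as follows. The function name and the sample scores are hypothetical and are not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        # With no variation on a scale there is nothing to correlate.
        raise ValueError("correlation is undefined when a scale has no variation")
    return covariation / sqrt(ss_x * ss_y)


# Hypothetical scale scores for six teachers (illustrative only).
reform_goal_scores = [2, 5, 7, 9, 11, 12]
reform_activity_scores = [0, 1, 2, 3, 4, 5]
print(round(pearson_r(reform_goal_scores, reform_activity_scores), 2))
```

The zero-variation check makes the point in the text concrete: when most teachers cluster at the same end of both traditional scales, there is little variation left for a correlation to detect.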

Our data confirm what Cohen and Peterson (1990) found in their study of the 
California mathematics framework, viz., that even teachers who endorse curriculum 
reform and implement it in their own classrooms do so by integrating the new with the 
traditional. Although close to half of the teachers in our sample (46 percent) report that 
they emphasize most of the reform goals in their teaching, only 12 percent of the sample 
engage in four or more reform instructional activities at least once or twice a week. 
However, we did identify seven teachers (10 percent of the sample) who reported a 
moderate or major emphasis on nine or more reform goals and who also reported using 
four or more reform-oriented instructional strategies at least once or twice a week. 
However, all but two of these seven teachers also use at least half of the traditional 
instructional strategies just as frequently. In other words, by their own self-reports, few 
respondents in our sample rely on the instructional strategies that mathematics reformers 
espouse for advancing reform goals these teachers seem to accept. Furthermore, even 
the few respondents who might, by their own reports, be characterized as "reform 
teachers" still use traditional teaching strategies as part of their instructional repertoires. 
Consistent with the implementation patterns that characterize the adoption of many 
classroom innovations, these teachers are layering new practices onto their existing ones. 
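
The thresholds used above to single out "reform teachers" can be written down as a simple rule. The sketch below assumes each teacher's survey has already been reduced to three counts; the class, field, and function names are hypothetical, and the sample records are invented for illustration.

```python
from dataclasses import dataclass

N_TRADITIONAL_STRATEGIES = 9  # the survey listed nine traditional strategies


@dataclass
class TeacherReport:
    reform_goals: int        # reform goals given a moderate or major emphasis
    reform_weekly: int       # reform strategies used at least once or twice a week
    traditional_weekly: int  # traditional strategies used at least once or twice a week


def is_reform_teacher(t):
    """Nine or more reform goals and four or more weekly reform strategies."""
    return t.reform_goals >= 9 and t.reform_weekly >= 4


def layers_new_on_old(t):
    """A reform teacher who also uses at least half of the traditional strategies."""
    # "At least half" of nine strategies means five or more.
    return is_reform_teacher(t) and 2 * t.traditional_weekly >= N_TRADITIONAL_STRATEGIES


# Hypothetical records (illustrative only, not study data).
sample = [
    TeacherReport(reform_goals=11, reform_weekly=5, traditional_weekly=7),
    TeacherReport(reform_goals=9, reform_weekly=4, traditional_weekly=3),
    TeacherReport(reform_goals=10, reform_weekly=1, traditional_weekly=8),
    TeacherReport(reform_goals=4, reform_weekly=0, traditional_weekly=9),
]
reform = [t for t in sample if is_reform_teacher(t)]
layered = [t for t in reform if layers_new_on_old(t)]
print(len(reform), len(layered))
```

Keeping the two tests separate mirrors the argument in the text: endorsing reform goals and using reform strategies identifies a small group, and a second check shows how many of that group still rely on traditional strategies.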
CONSISTENCY BETWEEN THE SURVEYS AND THE ARTIFACTS 

The difficulty in interpreting data on instructional goals is further confounded 
when we compare teachers' self-reports on the survey with the artifact data coded from 
their assignments and exams. Table 5.3 shows the rate of agreement between the 
surveys and exams and between the surveys and assignments on the degree of emphasis 
that teachers gave each of the instructional goals. The level of consistency between 
teachers' self-reports on the surveys and coders' depiction of their teaching gleaned from 
the artifacts was considerably less for goals than for either topic coverage or instructional 
strategies. However, the rate of agreement was slightly higher for assignments than for 
exams, perhaps because there were more data points from which to make inferences. 
On the whole, survey and artifact data were more consistent for traditional goals (four of 
the five traditional goals had rates of agreement above the mean for the entire list) than 
for reform goals (four of the 13 reform goals were above the mean). 
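
The two agreement measures reported for each goal can be illustrated with a short sketch. It assumes that the four response options (none, minor, moderate, major) are treated as an ordered scale, that "direct agreement" means the survey response and the coder's rating fall in the same category, and that "within one survey response category" also counts adjacent categories. The function name and sample data are hypothetical.

```python
EMPHASIS = {"none": 0, "minor": 1, "moderate": 2, "major": 3}


def agreement_rates(survey, coded):
    """Return (% direct agreement, % within one category) for paired ratings."""
    if len(survey) != len(coded) or not survey:
        raise ValueError("need two non-empty lists of equal length")
    pairs = [(EMPHASIS[s], EMPHASIS[c]) for s, c in zip(survey, coded)]
    n = len(pairs)
    direct = sum(1 for s, c in pairs if s == c)
    # Exact matches are included, so this rate is never below the direct rate.
    within_one = sum(1 for s, c in pairs if abs(s - c) <= 1)
    return 100 * direct / n, 100 * within_one / n


# Hypothetical ratings of one goal for four teachers (illustrative only):
# teachers' survey responses versus coders' ratings of their assignments.
survey_responses = ["major", "moderate", "moderate", "minor"]
coder_ratings = ["minor", "minor", "none", "minor"]
print(agreement_rates(survey_responses, coder_ratings))  # (25.0, 50.0)
```

In this invented example the teachers report more emphasis than the coders detect, which is the same direction of disagreement the chapter describes for reform goals.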

The major source of the discrepancies could be traced to the coders' very 
different judgments about the amount of emphasis that teachers were giving reform 
goals. For 12 of the 13 reform goals, coders indicated that 75 percent of the teachers 
had given these instructional objectives either a minor or no emphasis. This depiction of 
teachers' goals is generally consistent with the picture of their exams and assignments 
that they themselves provided in their survey responses. For example, the survey data 
presented in Figure 4.2 indicate that only a small proportion of teachers' exams require 
students to describe how to solve problems, explain their reasoning, or apply concepts to 
unfamiliar situations. On the other hand, when asked to characterize their instruction 
through the lens of the goals they stress, teachers presented a very different picture. For 
only two of the reform goals (writing about mathematical ideas, designing a study or 
experiment) did an equally high proportion of teachers report a small emphasis, thus 
agreeing with the coders. As noted previously, close to a majority reported giving a 




Table 5.3 

Instructional Goals: Consistency between Survey and Exams 
and between Survey and Assignments 

                                                        Exams                     Assignments
                                                        % Direct     % Within     % Direct     % Within
                                                        Agreement    One Survey   Agreement    One Survey
                                                                     Response                  Response
Goal                                                                 Category                  Category
Understanding the nature of proof                       [illegible]  70           37.5         79.7
Memorizing facts, rules, and steps                      [illegible]  78.7         29.7         79.7
Learning to represent problem structures in multiple    [illegible]  49.2         20.3         60.9
  ways (e.g., graphically, algebraically, numerically)
Integrating different branches of mathematics           [illegible]  [illegible]  21.3         [illegible]
  (e.g., algebra, geometry) into a unified framework
Conceiving and analyzing the effectiveness of different 3.3          26.2         [illegible]  32.8
  approaches to problem solving
Performing calculations with speed and accuracy         30.2         87.3         30.2         87.3
Showing the importance of math in daily life            [illegible]  31.2         17.2         48.4
Solving equations                                       [illegible]  64           29.7         [illegible]
Raising questions and formulating conjectures           [illegible]  28.9         11.5         42.3
Increasing students' interest in math                   5            18           [illegible]  37.5
Integrating math with other subjects                    6.7          48.3         18           [illegible]
Finding examples and counterexamples                    9.8          36.1         11.1         49.2
Judging the validity of arguments                       21.7         50           25.8         58.1
Discovering generalizations                             [illegible]  34.4         14.3         42.9
Representing and analyzing relationships using tables,  23.3         61.7         29           71
  charts, and graphs
Applying mathematical models to real-world phenomena    9.6          50           11.5         [illegible]
Writing about mathematical ideas                        41.7         81.7         43.5         88.7
Designing a study or experiment                         53.3         90           58           91.7
Writing equations to represent relationships            14.8         54           19.4         58
Solving problems for which there is no obvious method   17.3         [illegible]  21.2         65.4
  of solution

Mean                                                    19           53           23.4         62




moderate or major emphasis to most reform goals and for four goals, 75 percent or more 
reported doing so.^ 

It was these patterns (low levels of agreement between the survey and artifact 
data, more problems with reform than traditional goals, and teachers reporting a greater 
emphasis on reform goals than coders could detect) that initially led us to re-interview a 
subsample of teachers. One problem we discovered in these follow-up interviews is the 
different ways that teachers interpreted the response options (major, moderate, minor, 
and none) for the goals item. This same set was used on the NELS-SFU questionnaire 
and a variant of it has also been used on NAEP teacher surveys. But teachers interpret 
this response option quite differently. Some assumed that the underlying dimension was 
the frequency with which they undertook activities consistent with a particular goal, while 
others assumed that emphasis should be defined in terms of how important they 
considered a goal for their students' understanding, regardless of how often they 
undertook activities reflective of that goal. Other teachers combined frequency and 
importance in their assessment of emphasis. 

Coders were instructed to base their judgments on the prevalence of tasks 
consistent with a particular goal (for reform goals, those tasks were identified from 
NCTM materials) and the goal's relative importance as compared with other objectives 
the teacher seemed to be stressing. The notion that some teachers might place a major 
emphasis on a goal but not incorporate it into many activities (e.g., by stressing it with 
great clarity and forcefulness at a few key points during the course) is not something that 
we could measure well with artifacts. 

A second, and by far greater, problem is the different meanings that teachers 
ascribed to terms associated with the mathematics reform movement. Table 5.4 
illustrates differing interpretations of a reform goal which had the lowest level of 
agreement between the survey and the exam and assignment artifacts. At a general 
level, five of the six teachers interpreted the goal in a way consistent with its reform 



The four goals for which 75 percent or more of the teachers reported giving a moderate or 
major emphasis are either displayed in Figure 5.1 or listed in footnote 23. 




meaning, i.e., encouraging more than one solution method. But only the calculus 
teacher's discussion comes close to the notion of "conceiving," and most of the teachers 
seem to be interpreting problem-solving in a narrower sense of solving traditional 
mathematics problems, rather than strategies for solving "real world" or non-routine 
problems. 

This example of disparate interpretations is by no means unique. The previous 
chapters reported problems with other reform-oriented terms. Not only did teachers 
have differing interpretations of these terms, but in a number of cases they reported not 
knowing at all what the phrases meant. 
CONCLUSIONS 

Our analysis suggests that instructional goals are too problematic to be validly 
measured through national surveys of teachers. The data are inconsistent not only with 
artifact data, but also with teachers' own self-reports on other survey items such as those 
describing their exam formats. These inconsistencies between teachers' reports about 
their goal emphasis and their instructional strategies are difficult to interpret. It might 
be that the lack of a consistent relationship stems from the different meanings teachers 
ascribe to terms associated with the mathematics reform movement. Or, it may be that 
acknowledging the importance of particular goals is a precursor to implementing 
instructional practices consistent with those objectives. Or, it might be that despite 
teachers' willingness to report candidly about their reliance on traditional instructional 
strategies, social desirability becomes a factor in talking about their philosophy of 
teaching. These are among a number of plausible explanations for the disjuncture 
between teachers' reported goals and classroom practice. However, at this point we do 
not know which actually account for the inconsistencies. As a result, survey data on 
instructional goals cannot be unambiguously interpreted. 

Consequently, we would recommend that questions about teachers' instructional 
goals be deleted from national surveys. These items could then be replaced with more 
detailed measures of topic coverage, thus improving the amount and quality of data on 
the most central aspect of curriculum, without greatly increasing respondent burden. At 
least in the short term, data on teachers' goals might be more effectively gathered 
through smaller, supplemental studies. They might be collected as part of a validation 
study so that teachers' self-reports could be compared with their instructional artifacts; 
data might be collected using face-to-face, open-ended interviews, perhaps combined 
with classroom observations; or focus group and similar strategies might be used to 
probe the meanings that teachers ascribe to different goals. Interpreting survey data 
about attitudes and beliefs is always difficult, but in the case of teachers' goals, the 
dangers of misinterpretation seem particularly high and appear to outweigh the value of 
obtaining information through a relatively inexpensive, broad-based method. 




Chapter 6 

DESIGN CHOICES FOR IMPROVED CURRICULUM INDICATORS 
As curriculum assumes greater prominence on the education policy agenda, the 
demand for better indicators will continue. As a result, three questions face those 
responsible for the design and operation of educational indicator systems: 
o how will curriculum indicator data be used, 
o how much do various users need to know about curriculum, and 
o what is the most effective design for collecting curriculum indicator data? 
These are difficult questions to answer, and judgments about appropriate directions will 
be shaped as much by political values and resource constraints as by technical 
considerations. Nevertheless, the findings and conclusions from this project can help 
inform the decision process. 
USES 

The potential uses of curriculum indicator data could conceivably range from the 
kind of national snapshot now provided by NAEP, NELS, and other similar national 
surveys to the high stakes applications implied in some proposed uses of 
opportunity-to-learn standards (McDonnell, 1995). We would argue that an enhanced version of 
existing surveys will provide a reasonably valid depiction of the mathematics curriculum 
in this country. Nevertheless, there will be two major limitations: the characterization 
will be a rather general one and it may not provide a very accurate picture of either 
teachers' intentions or practices with regard to curriculum reform. Still, it is possible to 
obtain sound information about the depth and breadth of course content, how it varies 
across courses and types of schools, and a better indication of teachers' instructional 
repertoires than is currently available. 

However, despite the improvements that can be made in surveys over the next 
several years, we do not believe that the information collected will meet the necessary 
criteria for high stakes uses. The data will still be at such a level of generality that they 
cannot be used to make valid determinations about the alignment of individual schools 
with any type of content standards. Yet due process would require that valid and 
reliable measures of each school's curriculum be established before it could be held 
accountable for its instructional activities. Given the measurement and interpretation 
problems we have identified, we do not believe that curriculum indicator data could 
meet such a legal standard in the near future.^ Therefore, the most appropriate uses 
will continue to be informational ones. Curriculum indicator data can provide a general 
picture of the distribution of OTL across different types of schools and students and it 
can chart overall trends in curricular practice, but it cannot serve as the basis for 
decisions with potentially serious consequences for schools and teachers. 
INFORMATION NEEDS 

The second question raises the issue of what should be included in the domain of 
curriculum. In our study, we focused on content coverage, instructional strategies, and 
goals, and most indicator designers recommend some variant of these three categories. 
However, these categories are largely teacher-centered, and do not directly measure the 
role of students in constructing knowledge. Measuring active student learning greatly 
complicates both the measurement and the data collection task, and would likely 
necessitate more data than can be obtained from teacher surveys. However, it may soon 
be possible to consider another potentially large data base as a source for curriculum 
indicators. The increased use of student portfolios by states and local districts provides 
an opportunity to experiment with using them not just as the basis for assessing students, 
but also as sources of information about the nature of the teaching and learning process. 
Up to this point, research on student portfolios has focused on scoring them as measures 
of student achievement, but a parallel development effort could focus on how to extract 
data that might serve as indicators of the types of instructional strategies being used and 
students' role in those activities. 

Even if the curriculum is defined more narrowly in terms of the three categories 



Another issue that would arise if curriculum data were collected for high stakes purposes relates to the 
quality of teachers' survey responses. We found few social desirability problems in their responses. 
However, our surveys were administered under very low stakes conditions. All the research showing that 
[illegible] assessments strongly 
suggests that under high stakes conditions, teachers would likely bias their responses. They might find it 
[illegible] policymakers' expectations, thus corrupting the information 
collected. As a result, validation studies would need to be conducted much more frequently than if the data 
were only for informational purposes and no direct consequences for teachers were attached to its use. 




we used and confined to indicators that can be effectively measured through teacher 
surveys, the level of detail desired within each of these categories can vary considerably. 
Given the relationship between students' curricular exposure and their achievement, as 
well as our study results showing that surveys can provide reasonably accurate measures 
of topic coverage, we recommend that future national surveys place a greater emphasis 
on topic coverage. Not only are the topics currently included on national teacher 
surveys too few and too general to provide a valid picture of OTL, the information they 
generate is virtually useless in understanding curricular trends. Future items on topic 
coverage should be tailored to specific course levels, and should include more topics at a 
greater level of specificity. Our post-data collection questionnaire is an example of such 
an enhanced survey. 

Although we would accord it lower priority, we also recommend including a more 
comprehensive set of items dealing with instructional strategies. Asking teachers about a 
broader range of classroom practices would provide better information about the 
different ways that they combine strategies and how they integrate newer practices into 
their traditional repertoires. The findings from our study and a number of others 
indicate that teachers rely on only a few traditional strategies. Yet the expectation 
continues that they will adopt a variety of instructional reforms. Whether that 
expectation is met or not remains an open question. But asking teachers about only a 
few traditional and a few reform practices ignores the reality of policy implementation. 
If teachers do adopt the instructional strategies advocated by reformers, it will be 
through a process of adaptation and layering (Darling-Hammond, 1990). Without a 
fairly comprehensive set of instructional practice items, it will be difficult to determine 
exactly what these hybrid repertoires look like or how consistent they are with reformist 
guidelines. 

We recognize that our recommendations would require additional time for survey 
administration and hence increase respondent burden. This trade-off between improved 
data quality and respondent burden is a particular problem in the case of national 
surveys used to collect a variety of different data from the same respondents. However, 
as we argued in the previous chapter, teachers' instructional goals cannot be validly 
measured through survey data. Therefore, the additional burden associated with an 
enhanced survey on topic coverage and instructional strategies could be reduced 
somewhat by eliminating those items dealing with instructional goals. 
DATA COLLECTION STRATEGIES 

Decisions about use and scope will largely determine data collection strategies. 
Our findings suggest three areas of possible investment. The first has already been 
discussed: improving the design of national surveys. In addition to changing the relative 
emphasis accorded different aspects of curriculum, a number of suggested changes in 
item wordings and response option scales were outlined in previous chapters. These 
changes can be implemented quite cost-efficiently. 

A second area of future investment is in-depth studies on small samples of 
teachers and classrooms to monitor changes in mathematics teaching. These studies 
would use techniques that can measure instructional processes with greater subtlety than 
is possible through surveys. The more complete, nuanced data about such issues as 
teachers' understanding of reform goals and their different uses of reform strategies can 
then be used to interpret survey results and to improve the design of future surveys. 

The final area for future investment is the one that has been the primary focus of 
this study. We believe that the kind of validation study we have piloted should be 
integrated into the design of curriculum indicator systems. The primary, and most 
pressing, reason for such validation studies is the current reform context. Proposed 
changes in curriculum content and instructional practice mean that the language of 
mathematics teaching is in flux, and teachers do not share a common understanding of 
key terms. The effect is likely to be either a serious misinterpretation of survey results 
or an inability to interpret them at all. The solution is to make problematic survey items 
clearer through the use of more precise definitions and concrete examples. However, as 
we noted in the case of instructional strategies, so few teachers have adopted the new 
approaches that it is difficult to test alternative survey question wordings or experiment 
with new measures. Consequently, until language and practice have stabilized, validation 
studies (perhaps combined with indepth case studies and focus group interviews) will 
need to be an integral part of curriculum indicator systems. 




Although current interest in curriculum reform and hope for its widespread 
implementation provide the primary rationale for validation studies, they would still be 
needed even in more stable times. By collecting detailed data from multiple sources 
over shorter periods of time, validation studies can provide benchmarks against which to 
judge both the validity and reliability of survey data. It is only with such data that we 
can know whether teachers are reporting reliable estimates of topic coverage or whether 
their characterizations of exams and assignments are accurate. Such independently 
collected information not only helps in interpreting survey data, but also identifies 
sources of measurement error and informs the design of future surveys. 

However, validation studies do not have to be conducted every time a national 
survey is administered. Rather, we would recommend conducting one only when a new 
survey effort is begun, e.g., at the beginning of a longitudinal study like NELS or when 
major design changes are implemented in the NAEP teacher survey. The validation 
study would then be conducted as part of the first administration of the survey, with such 
efforts required only every five years or more. 

Although we would recommend several modifications in the procedures used in 
our pilot study, we believe that the basic structure is sound. The instructional artifacts 
worked well as benchmarks and, despite some obvious limitations, were easily collected 
from teachers. Although coding artifacts to extract information comparable to that 
collected from the surveys was a difficult task, we now have a template that can be 
improved upon and replicated quite easily. Given what we have learned from the pilot 
study, we are confident that the level of inter-rater agreement can be increased. The 
coding specifications can now be made more precise, and the coding process organized 
so that coders' work is reviewed more frequently through a moderation process that 
identifies discrepant judgments and makes appropriate adjustments. The coding of 
instructional artifacts will never be as reliable as, for example, the scoring of open-ended 
test items because the type and mix of material is unstandardized across teachers. 
Nevertheless, we believe that by using the survey categories as the basis for a content 
analysis of the artifacts and by closely monitoring the coding process, high quality 
benchmark data can be obtained. 
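
The moderation step described above depends on measuring how closely two coders agree and on locating the artifacts where they differ. The sketch below uses Cohen's kappa as one common chance-corrected agreement statistic; the report does not say which statistic the study used, so this choice, the function names, and the sample ratings are all assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders rating the same artifacts."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(coder_a)
    observed = sum(1 for a, b in zip(coder_a, coder_b) if a == b) / n
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Agreement expected by chance, given each coder's category frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both coders use one category")
    return (observed - expected) / (1 - expected)


def discrepant_items(coder_a, coder_b):
    """Positions of artifacts the two coders rated differently."""
    return [i for i, (a, b) in enumerate(zip(coder_a, coder_b)) if a != b]


# Hypothetical emphasis ratings of four artifacts (illustrative only).
first_coder = ["major", "minor", "none", "minor"]
second_coder = ["major", "minor", "minor", "minor"]
print(round(cohens_kappa(first_coder, second_coder), 2))
print(discrepant_items(first_coder, second_coder))  # [2]
```

The list of discrepant positions is what a moderation session would review; the kappa value offers a way to track whether agreement improves as the coding specifications are tightened.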




In order to make valid comparisons across courses, future validation samples will 
need to be somewhat larger, probably about twice as large as for the pilot study. 
However, given that there is less variation in the curriculum of upper-level courses such 
as calculus and that policy concerns about opportunity-to-learn are greatest in lower-level 
courses, one option might be to concentrate the study's focus on courses at or below 
algebra II. A particular emphasis might be on lower level courses such as pre-algebra 
and on those that integrate topics across traditional course categories. 

The similarity in our findings about teachers' instructional practices with those 
from larger, nationally-representative samples suggests that our smaller sample is 
generally reflective of high school mathematics teaching. However, in order to avoid 
idiosyncrasies that might characterize the teacher force in one or two states, future 
validation studies should include teachers from a larger number of states. For example, 
the proportion of California mathematics teachers who have a college major in 
mathematics is considerably below the national average (44 percent in 1991, as compared 
with a national average of 69 percent) (Blank and Gruebel, 1993).** With the 
modifications outlined, the basic approach used in this pilot study should serve as an 
effective template for future validation studies. 

Over the past decade, the quality of education indicators has steadily improved, 
with the greatest progress made in indicators of school and classroom processes. The 
"black box" that characterized older input-output models has been replaced with an 
increasingly comprehensive set of indicators that can report national trends in school 
organization and curriculum. But the failure to validate these indicators has remained a 
problem. Because items are typically transferred from one survey to another with no 
attempts at validation, the extent to which they measure how students are actually taught 
has been virtually unknown. This study represents a first step in ensuring that curriculum 
indicators are valid and reliable measures of the instruction occurring in the nation's 
classrooms. 






We focused on California because we assumed that the state's innovative curriculum frameworks 
would mean that more reform-oriented teachers would be included in our sample. However, like many 
others, we underestimated how difficult and slow implementation of the frameworks would be. 






REFERENCES 



Blank, R. K., & Gruebel, D. (1993). State indicators of science and mathematics 
education. Washington, DC: Council of Chief State School Officers. 

( iilirniniH l)«^|>Hiitii(«iil of I'lltiiMiioii (m}) MinhpmiilUx fhin,P»-i„f, f,» ( alifvntio 
puUUt MluHth Nhi iHiiir^iiio. ( A Auilioi 

< nlH'ii. n K . A IVl*i|«nn. I' I , ( I WO) Inl Iapiup of /u/,,, VyoUmlion ,m,l 

hiitlin^. ilnintnoml, I InnMnHloiiMl polhy Itiio pittillrr "Hip powri of 

ImMIoim ov<»i Hip top " tuhmnUmnl I vnluiilitm iitui htlUy Atuilnh. i; ( 1). H / 

Freeman, D. J., Kuhs, T. M., Porter, A. C., Floden, R. E., Schmidt, W. H., & Schwille, J. R. 
(1983). Do textbooks and tests define a national curriculum in elementary school 
mathematics? Elementary School Journal, 83(5), 501-513. 

U.MMlliojj. II (ly-M. Mm. I. /») A .Ilry in,Mi|« lulu, „li,m WppK, \\\\{}U), 



Mrtip. I ( lUO^. Mny IH) I hr plot llitrkniin; l |ir> i^hI ,|,,n„„ KriMn. fcy 

Mlii. nllon ipfoiin m i mny Iimvp |>o|tNti Mut alim WppK. p|t /» /n 

Jones, L. V., Davenport, E. C., Bryson, A., Bekhuis, T., & Zwick, R. (1986). Mathematics 
and science test scores as related to courses taken in high school and other 
factors. Journal of Educational Measurement, 23(3), 197-208. 

Kiln. I' , ( ( )p,M»jlnnillr!t. tHlrnli ttnil pNilit ipniion hi I HiiiMrin (I'll ). Ihr II A 
Mutlv oi nutthpttutU, \ III SluAptu Ktowth ,m,l tUmtoom ptoiPxwx (pp }N S{)}) 
Now Yoik: l'rijt«iiion 

McDonnell, L. M., Burstein, L., Ormseth, T., Catterall, J. M., & Moody, D. (1990). 
Discovering what schools really teach: Designing improved coursework indicators. 
Santa Monica, CA: RAND. 

McDonnell, L. M. (1995). Opportunity to learn as a research concept and a policy 
instrument. Educational Evaluation and Policy Analysis. 

M. I m.Khlln. M W (lUW) Ihr H ANI ) Clmn^r Aarnl S.i.ily .rvinilr.l M.mo 
prinpniivrn aiul iiilno irjilillrH l;lw uUonnI Krmitthr,, | ! |(, 



ERIC 



60 



McKnight, C. C, Crosswhite, F. J., Dossey, J. A., Kifer, Swafiford, J. Travers, K J., 
& Cooney, T. J. (1987). The underachieving amicuhan: Assessing U.S. school 
mathematics from an iniemational perspective. Champaign, IL Stipes Publishing. 

Merl, J. (1994, May 6). Furor continues to build over state's CLAS exams. Los Angeles 
Tunes, pp. Al, 18. 

Mullis, L V. S, Dossey, J. A^ Owen, E. tt, & Phillips, G. W. (1991). The state of 
mathematics achievement: NAEP's 1990 assessment of the nation and the trial 
assessment of the states (OERl Rep. lio.21'ST-M). Princeton, NJ: Educational 
Testing Service. 

Mullis, I. V. S, Jenkins, F., & Johnson, E. (1994). Effective schools in mathematics: 
Perspectives from the NAEP 1992 assessment (OERI Rep. No. 23-RR-Ol). 
Washington, D.C: U.S. Government Printing Office. 

Mumane, R. J., & Raizen, S. A. (1928). Improving indicators of the quality of science and 
mathematics education in grades K'12. Washington, DC: Natio aal Academy Press. 

National Council on Education Standards and Testing. (1992). Raising standarris far 
American education. Washington, DC: U.S. Government Printing Office. 

National Council of Teachers of Mathematics. (1989). Curriculum and evaluation 
standards for school nmthematics. Reston, VA: Author. 

National Council of Teachers of Mathematics. (1991). Professional standards for teaching 
mathematics. Reston, VA: Author. 

National Research Council (1989). Everybody counts: A report to the nation on the 
future of mathematics education. Washington, DC: National Academy Press. 

Oakes. J. & Carey. N. (1989). Curriculum. In RJ. Shavelson, L.M. McDonnell, J. 
Oakes (Ed.) Indicators pr monitoring mathematics and science education: A 
sourcebook. Santa Monica, CA: RAND. 

National Study Panel on Education Indicators. (1991). Education counts: An indicator 
system to monitor the nation's educational health. Washington, DC: U S 
Government Printing Office. 

O'Day. J. A., & Smith, M. S. (1993). Systemic reform and educational opportunity. In S. 
H. Fuhrman (Ed.), Designing coherent education policy (pp. 250-313) San 
Francisco: Jossey-Bass. 



61 



OERI State Accountability Study Group. (1988). Creating responsible and responsive 
accountability systems. Washington, DC: U.S. Department of Education. 

Owens, M. R. (1994, March 23). The name of the camel is truth. Education Week, 
XIII(26), 35-36. 

Porter, A. C (1991). Creating a system of school process indicators. Educational 
Evaluation and PoUcy Analysis^ 13(1), 13-29. 

Porter, A. C, Kirst. M. W., Osthofi; E. Smithson, J. L, & Schneider, S. A. (1993). 
Reform up close: An anafysis of high school mathematics and science classrooms. 
University of Wisconsin-Madison, Wisconsin Center for Education Research, 
School of Education. 

Raizen, S. A., & Jones, L. V. (Eds.). (1985). Indicators ofprecoUege education in science 
and mathematics: A preliminary review. Washington, DC: National Acadernty Press. 

Ravitch, D. (1995). National standards in American education. Washington, DC: 
Brookings. 

Rothman, R. (1993, ^ril 7). 'Deliveiy' standards for schools at heart of new policy 
debate. Education Week, p. 21. 

Schmidt, W. H., Wolfe, R. G., & Kifer, E. (1993). The identification and description of 
student growth in mathematics achievement In L. Burstein (Ed.), The lEA study 
of mathematics III: Student growth and classroom processes (pp. 59-75). New 
York: Pergamoa 

Shavelson, R, McDonnell, L. M., Oakes, J., & Carey, N. with Picus, L. (1987) Indicator 
systems for monitoring mathematics and science education. Santa Monica, CA: 
RAND. 

Shavelson, R. J., McDonnell, L. M., & Oakes, J. (Eds.). (1989) Indicators for 

monitoring mathematics and science education: A sourcebook. Santa Monica, CA: 
RAND. 

Travers, K. J., Garden, R. A., & Rosier, M. (1988). Introduction to the study. In D. F. 
RobitaiUe and R. A. Garden (Eds.), The lEA study of mathematics II: Contexts 
and outcomes of school mathematics (pp. 1-16) New York: Pergamon. 

Travers, K. J., (1993). Overview of the longitudinal version of the second international 
mathematics study. In L. Burstein (Vol. Ed.), The lEA study of mathematics Ill- 
Student growth and classroom processes, (pp. 1-27). New York: Pergamon, 



ERIC 



62 

Travers, K J. and Westbuiy, I. (Eds), The lEA study of mathematics I: Analysis of 
mathematics amicuia (1989). New York: Pergamon. 

Weiss, L (1994). A profile of science and mathematics education in the United States: 
1993. Chapel Hill, NC: Horizon Research, Inc. 

Wittrock, M. C (Ed). (1985). Handbook of research on teaching (3rd ed). New York: 
Macmillan. 



63 



APPENDIX 

Initial Teacher Survey (administered prior to artifact data collection) 
Enhanced Teacher Survey (administered after artifact data collection) 

Daily Log Form 






VALIDATING NATIONAL CURRICULUM INDICATORS 
INITIAL TEACHER SURVEY 



This questionnaire asks for some initial information about the goals, content, and instructional activities 
in the class that has been chosen for the RAND/UCLA study on validating curriculum indicators. This 
information, along with the instructional materials you will be providing, will help in describing students' 
educational experiences. 

The survey includes questions about characteristics of the class, teaching strategies, curriculum 
content, and general information about your teaching experience. 

Please mark your responses directly on the questionnaire. Place it in the envelope with your class 
assignments for the first week, and return it to RAND. 



THANK YOU FOR YOUR CONTRIBUTION TO THIS STUDY. 






Class Information 



Identity Code: 

Class Title: 

How many students are enrolled in this class? 

No. of Students: 

How many students in this class are from minority racial/ethnic groups (e.g., Black, Hispanic, 
Asian)? (If unsure, give your best estimate.) 

No. of Students: 

Which of the following best describes the level this class is considered to be? 

(Circle One) 

Remedial 1 

General 2 

Voc/Tech/Business 3 

College Prep/Honors 4 

AP 5 

Which of the following best describes the achievement level of the students in this class 
compared with the average student in this school? 

(Circle One) 

Higher achievement levels 1 

Average achievement levels 2 

Lower achievement levels 3 

Widely differing achievement levels 4 



Approximately how much homework do you typically assign each day to this class? 



Minutes: 






8. How often do you do each of the following with homework assignments? 

(Circle One Number on Each Line) 

                                                  Some of     Most of     All of 
                                        Never     the Time    the Time    the Time 

a. Keep records of who turned 
   in the assignment                      1          2           3           4 

b. Return assignments with grades 
   or corrections                         1          2           3           4 

c. Discuss the completed assignment 
   in class                               1          2           3           4 



9. Approximately how many minutes per week does this class meet regularly (not including lab 
periods)? 

Minutes: 

10. Approximately how many minutes per week does this class have lab sessions? 
(If there is no lab, enter "00.") 

Minutes: 



11. Indicate about what percent of class time is spent in a typical week doing each of the following 
with this class. 

(Circle One Number on Each Line) 

                                              None   <10%   10-24%   25-49%   50-74%   75-100% 

a. Providing instruction to the 
   class as a whole                             1      2       3        4        5        6 

b. Providing instruction to small 
   groups of students                           1      2       3        4        5        6 

c. Providing instruction to 
   individual students                          1      2       3        4        5        6 

d. Maintaining order/disciplining 
   students                                     1      2       3        4        5        6 

e. Administering tests or quizzes               1      2       3        4        5        6 

f. Performing routine administrative tasks 
   (e.g., taking attendance, making 
   announcements, etc.)                         1      2       3        4        5        6 

g. Conducting lab periods                       1      2       3        4        5        6 




12. How often do you use the following teaching methods or media? 

(Circle One Number on Each Line) 

                                                  Never/    1-2 Times   1-2 Times   Almost 
                                                  Rarely    a Month     a Week      Everyday   Everyday 

a. Lecture                                          1          2           3           4          5 

b. Use computers                                    1          2           3           4          5 

c. Use audio-visual material                        1          2           3           4          5 

d. Have teacher-led whole-group discussion          1          2           3           4          5 

e. Have students respond orally to questions        1          2           3           4          5 

f. Have student-led whole-group discussions         1          2           3           4          5 

g. Have students work together in 
   cooperative groups                               1          2           3           4          5 

h. Have students complete individual written work   1          2           3           4          5 

i. Have students give oral reports                  1          2           3           4          5 



13. Indicate the importance you give to each of the following in setting grades for students in your 
classes (excluding special education students). 

(Circle One Number on Each Line) 

                                                      Not         Somewhat    Very 
                                                      Important   Important   Important 

a. Achievement relative to the rest of the class         1           2           3 

b. Absolute level of achievement                         1           2           3 

c. Individual improvement or progress over past 
   performance                                           1           2           3 

d. Effort                                                1           2           3 

e. Class participation                                   1           2           3 

f. Completing homework assignments                       1           2           3 

g. Consistently attending class                          1           2           3 






For Math Teachers Only 



Those teaching science classes should SKIP TO QUESTION 16 on the following page. 

14. In this math class, how much emphasis do you give to each of the following objectives? 

(Circle One Number on Each Line) 

                                                        None   Minor   Moderate   Major 

a. Understanding the nature of proofs                     1      2        3         4 

b. Memorizing facts, rules, and steps                     1      2        3         4 

c. Learning to represent problem structures 
   in multiple ways (e.g., graphically, 
   algebraically, numerically, etc.)                      1      2        3         4 

d. Integrating different branches of math (e.g., 
   geometry, algebra) into a unified framework            1      2        3         4 

e. Conceiving and analyzing effectiveness of 
   multiple approaches to problem solving                 1      2        3         4 

f. Performing calculations with speed and accuracy        1      2        3         4 

g. Showing importance of math in daily life               1      2        3         4 

h. Solving equations                                      1      2        3         4 

i. Raising questions and formulating conjectures          1      2        3         4 

j. Increasing students' interest in math                  1      2        3         4 



15. Have you taught or reviewed the following topics in this math class during this year? (If you 
have reviewed and taught an item as new content, mark #3 only.) 

(Circle One Number on Each Line) 

1 = No, but it was taught previously 
2 = Yes, but I reviewed it only 
3 = Yes, but I taught it as new content 
4 = No, but I will teach or review it later this year 
5 = No, topic is beyond the scope of this course 

a. Integers                             1   2   3   4   5 
b. Patterns and functions               1   2   3   4   5 
c. Linear Equations                     1   2   3   4   5 
d. Polynomials                          1   2   3   4   5 
e. Properties of geometric figures      1   2   3   4   5 
f. Coordinate Geometry                  1   2   3   4   5 
g. Proofs                               1   2   3   4   5 
h. Trigonometry                         1   2   3   4   5 
i. Statistics                           1   2   3   4   5 
j. Probability                          1   2   3   4   5 
k. Calculus                             1   2   3   4   5 






For Science Teachers Only 



Those teaching math classes only should SKIP TO THE SECTION MARKED Teacher Background 



16. In this science class, how much emphasis do you give to the following objectives? 

(Circle One Number on Each Line) 

                                                        None   Minor   Moderate   Major 

a. Increasing students' interest in science               1      2        3         4 

b. Learning and memorizing scientific facts, 
   principles, and rules                                  1      2        3         4 

c. Learning scientific methods                            1      2        3         4 

d. Preparing students for future study in science         1      2        3         4 

e. Developing problem solving/inquiry skills              1      2        3         4 

f. Developing skills in lab techniques                    1      2        3         4 

g. Learning about applications of science 
   to environmental issues                                1      2        3         4 

h. Showing importance of science in daily life            1      2        3         4 



17. How often do you do each of the following activities in this science class? 

(Circle One Number on Each Line) 

                                                  Never/    1-2 Times   1-2 Times   Almost 
                                                  Rarely    a Month     a Week      Everyday   Everyday 

a. Have students do an experiment or 
   observation individually or in small groups      1          2           3           4          5 

b. Demonstrate an experiment or lead students 
   in systematic observations                       1          2           3           4          5 

c. Require students to turn in written reports 
   on experiments or observations                   1          2           3           4          5 

d. Discuss current issues and events in science     1          2           3           4          5 

e. Have students use computers for data 
   collection and analysis                          1          2           3           4          5 

f. Use computers for demonstrations/simulations     1          2           3           4          5 

g. Have students give oral reports                  1          2           3           4          5 

h. Have students independently design and 
   conduct their own science projects               1          2           3           4          5 

i. Discuss career opportunities in scientific 
   and technological fields                         1          2           3           4          5 

j. Discuss controversial inventions and 
   technologies                                     1          2           3           4          5 






18. For Biology teachers: Have you taught or reviewed the following topics in this Biology class 
during this year? If you have reviewed and taught an item as new content, mark #3 only. 

(Circle One Number on Each Line) 

1 = No, but it was taught previously 
2 = Yes, but I reviewed it only 
3 = Yes, but I taught it as new content 
4 = No, but I will teach or review it later this year 
5 = No, topic is beyond the scope of this course 

a. Cell structure and [illegible]        1   2   3   4   5 
b. [illegible]                           1   2   3   4   5 
c. Diversity of life                     1   2   3   4   5 
d. Metabolism and regulation 
   of the organism                       1   2   3   4   5 
e. Behavior of the organism              1   2   3   4   5 
f. Reproduction and development 
   of the organism                       1   2   3   4   5 
g. Human biology                         1   2   3   4   5 
h. [illegible]                           1   2   3   4   5 
i. Ecology                               1   2   3   4   5 



19. For Physics teachers: Have you taught or reviewed the following topics in this Physics class 
during this year? If you have reviewed and taught an item as new content, mark #3 only. 

(Circle One Number on Each Line) 

1 = No, but it was taught previously 
2 = Yes, but I reviewed it only 
3 = Yes, but I taught it as new content 
4 = No, but I will teach or review it later this year 
5 = No, topic is beyond the scope of this course 

a. Forms and sources of energy           1   2   3   4   5 
b. Forces, time, motion                  1   2   3   4   5 
c. Molecular/nuclear physics             1   2   3   4   5 
d. Energy/matter transformations         1   2   3   4   5 
e. Sound and vibrations                  1   2   3   4   5 
f. Light                                 1   2   3   4   5 
g. Electricity and magnetism             1   2   3   4   5 
h. Solids/fluids/gases                   1   2   3   4   5 






Teacher Background and Experience 



1. What Is your sex? 

Male 1 

Female 2 

2. Which best describes you? 

Asian or Pacific Islander 1 

Hispanic, regardless of race 2 

Black, not of Hispanic origin 3 

White, not of Hispanic origin 4 

American Indian or Alaskan Native 5 



3. What is the year of your birth? 



(Last 2 digits): 




4. Counting this year, how many years In total have you taught at either the elementary or 
secondary level? 



K-6: 



7-12: 



5. Counting this year, how many years in total have you taught In this school? 



Years: 






6. What academic degree(s) do you hold? 

(Circle All That Apply) 

No degree 0 --> SKIP TO Q8 

Associate degree 1 --> SKIP TO Q8, if only degree 

Bachelor's 2 

Master's 3 

Education specialist or professional 
diploma at least one year of work 
beyond master's level 4 

Doctorate 5 

First professional degree (e.g., M.D., D.D.S.) 6 



7. What were your major and minor fields of study for your bachelor's degree? 

(Circle All That Apply) 

                                     Major   Minor 

a. Education                           1       1 

b. Mathematics                         2       2 

c. Natural/physical sciences           3       3 

d. Life/biological sciences            4       4 

e. Computer science                    5       5 

f. Foreign language                    6       6 

g. English                             7       7 

h. History (or social science)         8       8 

i. Other                               9       9 






8. Circle the number beside any of the following subjects which you have taught this year. 

(Circle All That Apply) 

MATHEMATICS 

General Math 01 

Pre-Algebra 02 

Algebra I 03 

Algebra II 04 

Geometry 05 

Trigonometry 06 

Pre-Calculus 07 

Calculus 08 

Consumer/Business Math 09 

AP Calculus 10 

Other Math 11 

SCIENCE 

General Science 12 

General Physical Science 13 

Earth Science 14 

Principles of Technology 15 

Biology 16 

Chemistry 17 

Physics 18 

AP Science 19 

Other Science 20 

OTHER 

Computer Science 21 

Other non-math, non-science 
course 22 

Please describe 



Date completed (MO / DAY / YR): 



Thank you for your assistance. 

Please return this survey in the same envelope 
with your first week's instructional materials. 




VALIDATING NATIONAL CURRICULUM INDICATORS 
MATHEMATICS TEACHER SURVEY 




As part of the larger study to examine different ways of measuring 
curriculum trends in schools, this questionnaire asks you to report on the 
goals, content, and instructional activities in the class for which you have 
been providing us with your instructional materials. Specifically, it asks 
about the curriculum content covered, the teaching strategies and 
instructional practices used, and your goals, objectives and general beliefs 
about the way mathematics should be taught to this class. The information 
you provide, along with other data already collected, is intended to describe 
students' educational experiences. Also, because this study will inform 
future efforts, space is provided at the end of the questionnaire for your 
comments on any problems or recommendations. 

Please MARK YOUR RESPONSES DIRECTLY ON THE QUESTIONNAIRE. 
Place it in the envelope with your instructional materials for this week, and 
return it to RAND. 



THANK YOU FOR YOUR CONTRIBUTION TO THE STUDY. 






Class Characteristics 
Please provide the following information about the specific class listed below. 

Designated class: _____ 

1. How many students are in this class? 

   Total _____ 

   Females _____ 

   Males _____ 

2. How many of the students in this class are in the following grade levels? (Sum should 
equal total number of students given above.) 

a. 9th grade _____ 

b. 10th grade _____ 

c. 11th grade _____ 

d. 12th grade _____ 

3. Which of the following best describes the achievement level of the students in this class 
in comparison to the average student in this school? (Circle one.) 

This class consists primarily of students with: 

Higher achievement levels 1 

Average achievement levels 2 

Lower achievement levels 3 

Widely differing achievement levels 4 

4. How many of the students in this class are of limited or non-English speaking ability? _____ 



Form I 9/17/92 

5. How many of the students in this class are members of the following ethnic/racial groups? 
(Sum should equal total given above in question 1.) 

a. American Indian or Alaskan Native _____ 

b. Asian or Pacific Islander _____ 

c. Hispanic, regardless of race _____ 

d. Black (not of Hispanic origin) _____ 

e. White (not of Hispanic origin) _____ 

f. Other (specify) _____ 

6. How many students in this class are likely to do the following in the future? (Sum should 
equal total given above in question 1.) 

a. Attend a 4-year college _____ 

b. Attend a 2-year college/technical school _____ 

c. End formal education with high school _____ 

d. Not graduate from high school _____ 



Curriculum Coverage 
Please answer the following questions about the content you taught this class. 

7. What was the primary text used in this class? 

Title: _____ 

8. What chapters do you plan to cover by the end of this semester? 

Chapters: _____ 

How closely did you follow the text? (Describe your use of the text below.) 

_____ 

9. What additional chapters do you plan to cover over the course of this year? 

Chapters: _____ 






You will find a list of topics on this page and the next 2 pages. Please respond to the 
following questions for each of the topics listed. 

10. Have you taught or reviewed the following topics during this year in this class? 
(Circle your response.) 

1 = No, but it was taught previously. 

2 = Yes, but I reviewed it only. 

3 = Yes, I taught it as new content (includes new topics which will be reviewed later). 

4 = Not yet, but I will teach or review it later this school year. 

5 = No, topic is beyond the scope of this course or not in the school curriculum. 

11. Indicate the approximate number of periods devoted to each topic below. If you focus on a topic 
for 10 or 15 minutes on a given day, count that as a period. If you will teach or review a topic 
later this year, indicate the number of periods you anticipate spending on the topic. (Circle your 
response.) 

1 = None (zero) 

2 = One or two periods 

3 = Three to five periods 

4 = Six to ten periods 

5 = More than two weeks but less than one month (11 to 20 class periods) 

6 = One month or more (more than 20 periods) 
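
For anyone coding responses to item 11, the period scale above can be written as a simple lookup. The following is a minimal sketch for illustration only and is not part of the survey instrument: the function name `periods_to_code` is hypothetical, and it assumes the parenthetical period counts (for example, "11 to 20 class periods") define the category boundaries.

```python
def periods_to_code(periods: int) -> int:
    """Map a count of class periods to the item 11 response code (1-6).

    Boundaries follow the printed scale; treating the parenthetical
    period counts as the cut points is an assumption of this sketch.
    """
    if periods < 0:
        raise ValueError("periods must be non-negative")
    if periods == 0:
        return 1  # None (zero)
    if periods <= 2:
        return 2  # One or two periods
    if periods <= 5:
        return 3  # Three to five periods
    if periods <= 10:
        return 4  # Six to ten periods
    if periods <= 20:
        return 5  # 11 to 20 class periods
    return 6      # More than 20 periods


if __name__ == "__main__":
    for n in (0, 2, 5, 10, 20, 21):
        print(n, "periods -> code", periods_to_code(n))
```

A lookup of this kind would let period counts taken from teacher logs be compared directly with the category a teacher circled on the survey.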






      Topics                                        10. Taught or          11. Periods on 
                                                        reviewed?              topic? 

   a. Patterns and functions                        1   2   3   4   5      1   2   3   4   5   6 
   b. Estimation                                    1   2   3   4   5      1   2   3   4   5   6 
   c. Proportional reasoning                        1   2   3   4   5      1   2   3   4   5   6 
   d. Proofs                                        1   2   3   4   5      1   2   3   4   5   6 
   e. Tables and charts                             1   2   3   4   5      1   2   3   4   5   6 
   f. Graphing                                      1   2   3   4   5      1   2   3   4   5   6 
   g. Math modeling                                 1   2   3   4   5      1   2   3   4   5   6 
 * h. Ratios, proportions, and percents             1   2   3   4   5      1   2   3   4   5   6 
   i. Conversions among fractions, 
      decimals and percents                         1   2   3   4   5      1   2   3   4   5   6 
   j. Laws of exponents                             1   2   3   4   5      1   2   3   4   5   6 
 * k. Square roots                                  1   2   3   4   5      1   2   3   4   5   6 
   l. Polynomials                                   1   2   3   4   5      1   2   3   4   5   6 
 * m. [illegible]                                   1   2   3   4   5      1   2   3   4   5   6 
   n. Slope                                         1   2   3   4   5      1   2   3   4   5   6 
   o. Writing equations for lines                   1   2   3   4   5      1   2   3   4   5   6 
   p. Inequalities                                  1   2   3   4   5      1   2   3   4   5   6 
   q. Quadratic equations                           1   2   3   4   5      1   2   3   4   5   6 
 * r. Applications of measurement 
      formulas (e.g., area, volume)                 1   2   3   4   5      1   2   3   4   5   6 
   s. Properties of geometric figures               1   2   3   4   5      1   2   3   4   5   6 
 * t. Pythagorean Theorem                           1   2   3   4   5      1   2   3   4   5   6 






      Topics                                        10. Taught or          11. Periods on 
                                                        reviewed?              topic? 

   u. Coordinate geometry                           1   2   3   4   5      1   2   3   4   5   6 
   v. Probability                                   1   2   3   4   5      1   2   3   4   5   6 
   w. Statistics                                    1   2   3   4   5      1   2   3   4   5   6 
 * x. Distance, rate, time problems                 1   2   3   4   5      1   2   3   4   5   6 
   y. Growth and decay                              1   2   3   4   5      1   2   3   4   5   6 
** z. Transformational geometry                     1   2   3   4   5      1   2   3   4   5   6 
** aa. Logarithms                                   1   2   3   4   5      1   2   3   4   5   6 
** bb. Conic sections                               1   2   3   4   5      1   2   3   4   5   6 
** cc. Trigonometry                                 1   2   3   4   5      1   2   3   4   5   6 
** dd. Polar coordinates                            1   2   3   4   5      1   2   3   4   5   6 
** ee. Sequences                                    1   2   3   4   5      1   2   3   4   5   6 
** ff. Complex numbers                              1   2   3   4   5      1   2   3   4   5   6 
** gg. Vectors                                      1   2   3   4   5      1   2   3   4   5   6 
** hh. Matrices and matrix operations               1   2   3   4   5      1   2   3   4   5   6 
** ii. Calculus                                     1   2   3   4   5      1   2   3   4   5   6 
** jj. Limits                                       1   2   3   4   5      1   2   3   4   5   6 
** kk. Integration                                  1   2   3   4   5      1   2   3   4   5   6 
** ll. Fundamental counting principle, 
       permutations, combinations                   1   2   3   4   5      1   2   3   4   5   6 
** mm. Measures of dispersion (range, 
       variance, standard deviation, etc.)          1   2   3   4   5      1   2   3   4   5   6 
** nn. Discrete math (e.g., Euler 
       circuits, directed graphs, trees)            1   2   3   4   5      1   2   3   4   5   6 

 * indicates topic in Form I only. 
** indicates topic in Form II only. 






12. For each item below, please indicate the types of student understanding you expect from the 
majority of this class by the end of the course. (Circle the highest number that applies.) 

1 = Recognizes/knows the rule or principle 

2 = When given the rule or principle, is able to use it 

3 = Knows when and how to apply the rule or principle 

4 = Can both apply the rule or principle and explain why it works as it does 

5 = Not applicable (rule or principle beyond the scope of this class) 

a. Division by zero is not allowed: $\frac{a}{0}$ 
   is undefined for all numbers $a$                           1   2   3   4   5 

b. In a plane, the sum of the angle measures in 
   any triangle is 180°                                       1   2   3   4   5 

c. The area of a triangle: $A = \frac{1}{2}bh$                1   2   3   4   5 

d. The Pythagorean Theorem                                    1   2   3   4   5 

e. The slope of a vertical line is undefined                  1   2   3   4   5 

f. The distance formula: 
   $d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$                 1   2   3   4   5 

g. If $\frac{a}{b} = \frac{c}{d}$, then $ad = bc$             1   2   3   4   5 

h. $(a + b)^2 = a^2 + 2ab + b^2$                              1   2   3   4   5 

i. The product rule for exponents: 
   $a^m \cdot a^n = a^{m+n}$                                  1   2   3   4   5 

j. The square root of a negative number 
   is not a real number                                       1   2   3   4   5 

k. The log of a negative number is not defined                1   2   3   4   5 

l. A continuous function need not be differentiable           1   2   3   4   5 






Instructional Practices 



Please answer the following questions about the organization, teaching strategies and 
instructional practices you used with this class. 

13. How often do you use each of the following instructional strategies with this class? (The strategy 
need not take the entire class period.) 

                                              Almost    Once or    Once or    Once or 
                                              every     twice a    twice a    twice a 
                                              day       week       month      semester   Never 

a. Lecture                                      1          2          3          4          5 

b. Have students respond orally to 
   questions on subject matter                  1          2          3          4          5 

c. Have student-led whole group 
   discussions                                  1          2          3          4          5 

d. Have teacher-led whole group 
   discussions                                  1          2          3          4          5 

e. Correct and/or review homework 
   in class                                     1          2          3          4          5 

f. Demonstrate working an exercise 
   at the board                                 1          2          3          4          5 

g. Have students work exercises 
   at the board                                 1          2          3          4          5 

h. Have students work individually on 
   written assignments or worksheets 
   in class                                     1          2          3          4          5 

i. Have students give oral reports              1          2          3          4          5 

j. Administer a test (full period)              1          2          3          4          5 

k. Administer a quiz                            1          2          3          4          5 

l. Use manipulatives (e.g., conic section 
   models) to demonstrate a concept             1          2          3          4          5 



(Continued on next page.) 






                                              Almost    Once or    Once or    Once or 
                                              every     twice a    twice a    twice a 
                                              day       week       month      semester   Never 

m. Discuss career opportunities 
   in mathematics                               1          2          3          4          5 

n. Have small groups work 
   on problems to find a joint 
   solution                                     1          2          3          4          5 

o. Have whole class discuss solutions 
   developed in small groups                    1          2          3          4          5 

p. Have students practice or drill 
   on computational skills                      1          2          3          4          5 

q. Have students work on problems for 
   which there is no obvious method of 
   solution                                     1          2          3          4          5 

r. Have students represent and analyze 
   relationships using tables and graphs        1          2          3          4          5 

s. Have students use calculators to solve 
   exercises or problems                        1          2          3          4          5 

t. Have students use computers to solve 
   exercises or problems                        1          2          3          4          5 

u. Have students respond to questions or 
   assignments that require writing at 
   least a paragraph                            1          2          3          4          5 

v. Have students keep a mathematics 
   journal                                      1          2          3          4          5 

w. Have students read textbooks or 
   supplementary materials                      1          2          3          4          5 

x. Have students work with manipulatives        1          2          3          4          5 

y. Have students work on next day's 
   homework in class                            1          2          3          4          5 

z. Summarize main points of today's 
   lesson                                       1          2          3          4          5 

aa. Have students work on projects in class     1          2          3          4          5 






14. Indicate what percent of class time is spent in a typical week doing each of the following with 
this class. (Circle one on each line. The total need not sum to 100%.) 

                                                              Percent 
                                              None   <10   10-24   25-49   50-74   75-100 

a. Providing instruction to the class 
   as a whole                                   1     2      3       4       5       6 

b. Providing instruction to small groups 
   of students                                  1     2      3       4       5       6 

c. Providing instruction to individual 
   students                                     1     2      3       4       5       6 

d. Maintaining order/disciplining students      1     2      3       4       5       6 

e. Administering tests or quizzes               1     2      3       4       5       6 

f. Performing routine administrative tasks 
   (e.g., taking attendance, making 
   announcements, etc.)                         1     2      3       4       5       6 

g. Conducting lab periods                       1     2      3       4       5       6 



Evaluation and Grading Practices 



15. On the tests, quizzes, and exams you administer to this class, about what percent of the items 
are of the following types? (Total should equal 100% in each column.) 

                                              Tests and Quizzes     Final Exam 

a. Multiple-choice                                 ____ %             ____ % 

b. Short-answer                                    ____ %             ____ % 

c. Essay                                           ____ %             ____ % 

d. Open-ended problems 
   (i.e., where students generate their 
   own solutions)                                  ____ %             ____ % 

e. Other (specify)                                 ____ %             ____ % 






16. On the tests and quizzes you administer to this class, about what percent of the items are of 
the following types? (Total need not sum to 100%.) 

a. Items that require students to recognize or 
   recall definitions or concepts                              ____ % 

b. Items that require the use of algorithms 
   to solve problems                                           ____ % 

c. Items that require students to describe 
   how to solve problems                                       ____ % 

d. Items that require students to explain 
   their reasoning                                             ____ % 

e. Items that require the application of concepts or 
   principles to different or unfamiliar situations            ____ % 

h. Items that require a critique or analysis 
   of a suggested solution to a problem                        ____ % 

i. Other (specify)                                             ____ % 



17. On the tests and quizzes you administer to this class, about what percent of the items are of 
the following types? (Total need not sum to 100%.) 

a. Exercises or problems that are minor variations 
   of homework or class exercises or problems                  ____ % 

b. Exercises or problems with more than one 
   possible answer                                             ____ % 

c. Exercises or problems with more than one 
   possible approach                                           ____ % 

d. Exercises or problems that require more than 
   one step to reach a solution                                ____ % 

e. Items that require the use of tabular or graphical data     ____ % 



18. What will be the approximate distribution of final student grades in this class? (Total 
should equal number of students in the class.) 

A's ____ 

B's ____ 

C's ____ 

D's ____ 

F's ____ 



Homework Policies and Practices 

19. Approximately how much homework do you typically assign each day to this class? 
____ minutes 




20. How often do you do each of the following with homework assignments? 

                                                          Some of     Most of     All of 
                                                Never     the time    the time    the time 

a. Keep records of who did or who turned 
   in the assignment                              1          2           3           4 

b. Return assignments with grades 
   or corrections                                 1          2           3           4 

c. Discuss the completed assignment in class      1          2           3           4 



21. How frequently do you assign each of the following types of homework? 

                                              Almost    Once or    Once or    Once or 
                                              every     twice a    twice a    twice a 
                                              day       week       month      semester   Never 

a. Reading the text or supplementary 
   materials                                    1          2          3          4          5 

b. Doing exercises or problems from 
   the text                                     1          2          3          4          5 

c. Doing exercises or problems from 
   worksheets                                   1          2          3          4          5 

d. Writing definitions of concepts              1          2          3          4          5 

e. Applying concepts or principles to 
   different or unfamiliar situations           1          2          3          4          5 

f. Solving problems for which there is 
   no obvious method of solution                1          2          3          4          5 

g. Gathering data, conducting 
   experiments, working on projects             1          2          3          4          5 

h. Preparing oral reports                       1          2          3          4          5 

i. Preparing written reports                    1          2          3          4          5 

j. Extending results established in 
   class (e.g., deriving or proving 
   new results)                                 1          2          3          4          5 

k. Keeping a journal                            1          2          3          4          5 

l. Solving applied problems 
   (e.g., finding the amount of 
   water needed to fill a pool)                 1          2          3          4          5 

m. Explaining newspaper/magazine 
   articles                                     1          2          3          4          5 




22. How frequently do you use the materials and equipment listed below with this class?

                                             Almost   Once or  Once or  Once or
                                             every    twice a  twice a  twice a
                                             day      week     month    semester  Never

a. Graph paper                                 1         2        3        4        5

b. Protractors, rulers, or compasses           1         2        3        4        5

c. A-V equipment (e.g., film                   1         2        3        4        5
   projector, VCR, cassette, TV)

d. Overhead projector                          1         2        3        4        5

f. Four-function calculator                    1         2        3        4        5

g. Scientific calculator                       1         2        3        4        5

h. Graphing calculator                         1         2        3        4        5

i. Other (specify)                             1         2        3        4        5






Goals, Objectives and Teacher Beliefs

23. How much emphasis do you give to each of the following objectives?

                                                          Emphasis
                                                  No     Minor   Moderate   Major

a. Understanding the nature of proof              1        2         3        4

b. Memorizing facts, rules and steps              1        2         3        4

c. Learning to represent problem structures in    1        2         3        4
   multiple ways (e.g., graphically, algebraically,
   numerically)

d. Integrating different branches of mathematics  1        2         3        4
   (e.g., algebra, geometry) into a unified framework

e. Conceiving and analyzing the effectiveness of  1        2         3        4
   different approaches to problem solving

f. Performing calculations with speed             1        2         3        4
   and accuracy

g. Showing the importance of math in daily life   1        2         3        4

h. Solving equations                              1        2         3        4

i. Raising questions and formulating conjectures  1        2         3        4

j. Increasing students' interest in math          1        2         3        4

k. Integrating math with other subjects           1        2         3        4

l. Finding examples and counterexamples           1        2         3        4

m. Judging the validity of arguments              1        2         3        4

n. Discovering generalizations                    1        2         3        4

o. Representing and analyzing relationships       1        2         3        4
   using tables, charts and graphs

p. Applying mathematical models to real-world     1        2         3        4
   phenomena

q. Writing about mathematical ideas               1        2         3        4

r. Designing a study or experiment                1        2         3        4

s. Writing equations to represent relationships   1        2         3        4

t. Solving problems for which there is no         1        2         3        4
   obvious method of solution




24. Indicate the degree to which you emphasized the following strategies with this class.

                                                          Emphasis
                                                  No     Minor   Moderate   Major

a. Students received a good deal of practice      1        2         3        4
   to become competent at mathematics.

b. I routinely justified the mathematical         1        2         3        4
   principles and procedures used.

c. I corrected student errors immediately.        1        2         3        4

d. Students were provided frequent                1        2         3        4
   opportunities to discover mathematical
   ideas for themselves.

e. I gave step-by-step directions for applying    1        2         3        4
   algorithms and procedures.

f. Students were provided opportunities to apply  1        2         3        4
   mathematics to real-world situations.

g. Students developed their own methods           1        2         3        4
   of solving math problems.

h. Students were frequently expected to discover  1        2         3        4
   generalizations and principles on their own.

i. Students learned to solve problems in          1        2         3        4
   different ways.

j. Students were required to memorize and         1        2         3        4
   apply rules.

k. Students learned there is usually a rule to    1        2         3        4
   apply when solving a math problem.

l. Students received step-by-step directions      1        2         3        4
   to aid in solving problems.






25. There are a variety of ways in which teachers describe their role in helping their students learn
mathematics. Statements A through D represent several possibilities. Please read these statements,
then answer the question below about your role.

A: "I mainly see my role as a facilitator. I try to provide opportunities and resources for my students
to discover or construct mathematical concepts for themselves."

B: "I think I need to provide more guidance than that. Although I provide opportunities for them to
discover concepts, I also try to lead my students to figure things out by asking pointed questions
without telling them the answers."

C: "I emphasize student discussion of math in my classroom. We talk about concepts and problems
together, exploring the meaning and evaluating the reasoning that underlies different strategies.
My role is to initiate and guide these discussions."

D: "That's all nice, but my students really won't learn math unless you go over the material in a
detailed and structured way. I think it's my job to explain, to show students how to do the work,
and to give them practice doing it."

Which statement best typifies your conception of your role in helping students in this class learn math?
(Place an X on the continuum below to indicate your role.)

A ---------- B ---------- C ---------- D



26. Below are two pairs of statements. Each pair represents opposite ends of a continuum in curriculum
approaches. After reading a pair of statements, place an X on the line between that pair indicating
where you would place your approach with this class.



Pair 1: My primary goal is to help students

A: learn mathematical
terms, master computational
skills and solve word problems

B: achieve a deeper conceptual
understanding of mathematics

A ---------- B



Pair 2: In this mathematics class, I aim for

A: in-depth study of selected
topics and issues, even if it
means sacrificing coverage

B: comprehensive coverage
even if it means sacrificing
in-depth study

A ---------- B






The following questions concern the questionnaire itself. Please provide this information
so that we might improve the questionnaire for future use.

27. Were any of the questions confusing or unclear? 
No 

Yes If Yes, please list the question number and describe the source of confusion. 

Number Source of Confusion 



28. Use the space below to describe any other problems or make any recommendations about the 
questionnaire. 






Daily Log

Name ______________          Course ______________
Date ______________          School ______________



1. List the content covered in this class period by briefly describing it or providing examples.



TOPICS 



2. What modes of instruction did you use? (Check all that apply.)

Lecture to entire class
Demonstrate an exercise at the board
Use manipulatives or audio-visual materials to demonstrate a concept
Demonstrate an experiment
Lead question and answer session
Work with small groups
Work with individual students
Correct or review homework
Other (please specify)



3. What activities did students engage in during this period? (Check all that apply.)

Listen and take notes
Work exercises at board
Work individually on written assignments or worksheets
Work with other students
Work with manipulatives
Use calculators
Respond to questions
Discuss topics from lesson
Work on next day's homework
Work on computer
Conduct lab experiment
Write lab report
Other (please specify)



Comments: 






